Re: Question about mutual transit and complex BGP peering

2024-04-22 Thread Matthew Petach
On Mon, Apr 22, 2024 at 7:35 AM Sriram, Kotikalapudi (Fed) via NANOG <nanog@nanog.org> wrote:

> Requesting responses to the following questions. Would be helpful in some
> IETF work in progress.
>
> Q1: Consider an AS peering relationship that is complex (or hybrid)
> meaning, for example, provider-to-customer (P2C) for one set of prefixes
> and lateral peers (i.e., transit-free peer-to-peer (P2P)) for another set
> of prefixes.  Are these diverse relationships usually segregated, i.e., P2C
> on one BGP session and P2P on another?  How often might they co-exist
> within a single BGP session?
>
>
Every time I've been in relationships like this, the fundamental answer is
always "follow the money".

If there's dollars flowing relative to the "provider-to-customer"
relationship, but no dollars flowing along the "peer-to-peer" relationship,
you need a solid way to determine which bits are taking the zero-dollar
pathway, and which bits are taking the non-zero-dollar pathway.

Whatever means are available to positively distinguish the traffic on an
unambiguous basis that both networks agree on is what determines the setup.

In many cases, separate physical ports with separate BGP sessions (and
sometimes even separate VRFs) are the only way that both parties fully trust
all the right bits are being accounted for in each case.

In other relationships, flow data is considered adequate to determine how
much traffic is zero dollar, and how much traffic is non-zero dollar.  In
that case, it can be a single BGP session, single port.
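
To make that concrete, here's a rough sketch of the kind of classification
flow-based accounting involves--longest-prefix-matching each flow's
destination against the prefix lists both sides have agreed are P2C versus
P2P, and totalling bytes per bucket. The prefixes and flow records below
are made up purely for illustration:

    from ipaddress import ip_address, ip_network

    # Agreed-upon prefix lists (made up for illustration).
    P2C_PREFIXES = [ip_network("203.0.113.0/24")]   # billed, provider-to-customer
    P2P_PREFIXES = [ip_network("198.51.100.0/24")]  # settlement-free peering

    def classify(dst_ip):
        # Longest prefix match across both buckets; unmatched traffic is flagged.
        addr = ip_address(dst_ip)
        matches = [(net.prefixlen, bucket)
                   for bucket, nets in (("p2c", P2C_PREFIXES), ("p2p", P2P_PREFIXES))
                   for net in nets if addr in net]
        return max(matches)[1] if matches else "unclassified"

    # Toy flow records: (destination, bytes).
    flows = [("203.0.113.7", 1_500_000), ("198.51.100.9", 800_000)]
    totals = {"p2c": 0, "p2p": 0, "unclassified": 0}
    for dst, nbytes in flows:
        totals[classify(dst)] += nbytes
    print(totals)  # {'p2c': 1500000, 'p2p': 800000, 'unclassified': 0}

The hard part isn't the lookup, of course--it's getting both parties to
agree on the prefix lists and to trust each other's flow sampling.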



> Q2: Consider an AS peering relationship that is mutual transit (i.e., P2C
> relationship in each direction for all prefixes).  Is this supported within
> a single BGP session?  How often might the ASes set up two separate BGP
> sessions between them -- one for P2C in one direction (AS A to AS B) and
> the other for P2C in the opposite direction (AS B to AS A)?
>

This is just a variant of a normal peer-to-peer relationship, most likely
with a traffic ratio involved.
In most of these situations, as long as the traffic is within the defined
ratio, accounting for the bits isn't worth it; sending a bill from A to B
for $X, and a different bill from B to A for $X+$Y, where $Y is generally
much smaller than $X, is more headache than it's worth.
And once the ratio goes outside of the prescribed range, you're not really
mutual transit anymore, you're provider-to-customer, and the only wrinkle
is which one considers itself the provider, and which the customer.
Witness Level 3 versus Comcast versus Netflix from years ago:
https://arstechnica.com/tech-policy/2010/12/comcastlevel3/
https://publicknowledge.org/netflix-cdn-v-the-cable-guys-or-comcast-v-level-3-part-deux-peering-payback/

Again--when everything is within ratio, and pipes aren't full, no need for
separate ports or separate BGP
sessions.

Once things start to fill up, though, then things get ugly.  That's when
different sessions come into play,
with some traffic being shunted to congested sessions, while the two sides
battle it out.

It still comes down to the same fundamental rule, though--follow the
money.   ^_^;

Thanks!

Matt




> Thank you.
>
> Sriram
> Kotikalapudi Sriram, US NIST
>


Re: N91 Women mixer on Sunday?

2024-03-29 Thread Matthew Petach
On Thu, Mar 28, 2024 at 11:17 PM Eric Parsonage  wrote:

> It's easily fixed by having a mixer at the same time for the other half of
> the gathering population, thus showing that all of the gathering population
> matters equally.
>


I believe the mixer for the other half of the gathering population has been
going on for decades, and is generally referred to as "drinks at the hotel
lobby bar".
Just because it isn't called out by name doesn't mean that the male half of
the population hasn't been meeting and mixing and mingling already for
years.  ;-P

I'm with Randy Bush on this.  The stakeholders in that event should have
the say in what happens with it; not the rest of us.
Those of us old white males need to check our privilege, and recognize that
we've *been* having "mixers" for decades.
We don't need to put a stake in the ground and push for our equality; we've
already been on the beneficiary side of the
inequality for decades.

Matt


Re: Backward Compatibility Re: 202401100645.AYC Re: IPv4 address block

2024-01-12 Thread Matthew Petach
On Fri, Jan 12, 2024 at 2:47 PM Randy Bush  wrote:

> > Perhaps you are too young to realize that the original IPv6 plan was
> > not designed to be backward compatible to IPv4, and Dual-Stack was
> > developed (through some iterations) to bridge the transition between
> > IPv4 and IPv6? You may want to spend a few moments to read some
> > history on this.
>
> ROFL!!!  if there is anything you can do to make me that young, you
> could have a very lucrative career outside of the internet.
>
> hint: unfortunately i already had grey hair in the '90s and was in the
> room for all this, and spent a few decades managing to get some of the
> worst stupidities (TLA, NLA, ...) pulled out of the spec.  at iij, we
> rolled ipv6 on the backbone in 1997.
>
> randy
>

OMFG,  that just made my afternoon.  :D  :D
Someone calling Randy Bush "too young".

If Randy Bush is too young, the rest of us must still need our diapers
changed on a regular basis.  :P

Matt


Re: 202401100645.AYC Re: IPv4 address block

2024-01-12 Thread Matthew Petach
On Fri, Jan 12, 2024 at 2:43 AM Nick Hilliard  wrote:

> Matthew Petach wrote on 11/01/2024 21:05:
> > I think that's a bit of an unfair categorization--we can't look at
> > pre-exhaustion demand numbers and extrapolate to post-exhaustion
> > allocations, given the difference in allocation policies pre-exhaustion
> > versus post-exhaustion.
>
> Matt,
>
> the demand for publicly-routable ipv4 addresses would be comparable to
> before, with the additional pressure of several years of pent-up demand.
>
> You're right to say that allocation policies could be different, but we
> had discussions about run-out policies in each RIR area in the late
> 2000s and each RIR community settled on particular sets of policies. I
> don't see that if an additional set of ipv4 address blocks were to fall
> out of the sky, that any future run-out policies would be much different
> to what we had before.
>
> So 240/4 might last a month, or a year, or two, or be different in each
> RIR service area, but it's not going to change anything fundamental
> here, or permanently move the dial: ipv4 will still be a scarce resource
> afterwards.
>
> Nick
>


Hi Nick,

I participated in many of those pre-exhaustion policy discussions at ARIN
meetings;
at the time, I thought a hard landing would motivate everyone to simply
shift to IPv6.

Having lived through the free-pool exhaustion, and discovered that the
hard-landing concept didn't get people to move to IPv6--it just made the
battle for IPv4 resources more cutthroat--I've come to rethink my earlier
stances on NRPM updates.  I suspect
I'm not the only one who sees things differently now, in a post-exhaustion
world with
no signs of IPv6 adoption crossing the nebulous tipping point any time soon.

In light of that, I strongly suspect that a second go-around at developing
more beneficial
post-exhaustion policies might turn out very differently than it did when
many of us were
naively thinking we understood how people would behave in a post-exhaustion
world.

If we limit every registrant to only what is necessary to support the
minimum level of
NAT'd connectivity for IPv4, we can stretch 240/4 out for decades to come.
You don't
need a *lot* of IPv4 space to run 464XLAT, for example, but you *do* need
at least a
small block of public IPv4 addresses to make the whole thing work.  If you
limit each
requesting organization to a /22 per year, we can keep the internet mostly
functional
for decades to come, well past the point where L*o has retired, and Android
starts
supporting DHCPv6.  ;)
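
A quick back-of-the-envelope on that, with the demand figure being a pure
assumption for illustration:

    # How long does 240/4 last at one /22 per requesting org per year?
    total_slots = 2 ** (22 - 4)      # number of /22s inside a /4: 262,144
    new_orgs_per_year = 4_000        # assumed demand; pick your own number
    print(total_slots)                      # 262144
    print(total_slots / new_orgs_per_year)  # ~65 years at that rate

Even at several thousand new requesting organizations per year, that's
decades of runway.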

But I agree--if we looked at 2000's era policies, 240/4 wouldn't last
long.  I just think
that many of us have matured a bit since then, and would vote differently
on updates
to the NRPM.  ^_^

Thanks!

Matt


Re: 202401100645.AYC Re: IPv4 address block

2024-01-11 Thread Matthew Petach
On Thu, Jan 11, 2024 at 9:29 AM Tom Beecher  wrote:

> Christopher-
>
>> Reclassifying this space would add 10+ years onto the free pool for each
>> RIR. Looking at the APNIC free pool, I would estimate there is about 1/6th
>> of a /8 pool available for delegation, another 1/6th reserved.
>> Reclassification would see available pool volumes return to pre-2010 levels.
>>
>
> Citing Nick Hilliard from another reply, this is an incorrect statement.
>
> on this point: prior to RIR depletion, the annual global run-rate on /8s
>> measured by IANA was ~13 per annum. So that suggests that 240/4 would
>> provide a little more than 1Y of consumption, assuming no demand
>> back-pressure, which seems an unlikely assumption.
>>
>

Hi Tom,

I think that's a bit of an unfair categorization--we can't look at
pre-exhaustion demand numbers and extrapolate to post-exhaustion
allocations, given the difference in allocation policies pre-exhaustion
versus post-exhaustion.

If we limited ISPs to a single /22 of post-exhaustion space, with a minimum
1 year waiting period to come back to request an additional /22, 240/4
would last a good long time.
That aligns with ARIN's current NRPM initial allocation, post-exhaustion:
4.2.2. Initial Allocation to ISPs

All ISP organizations without direct assignments or allocations from ARIN
qualify for an initial allocation of up to a /22, subject to ARIN’s minimum
allocation size.

If you already *have* existing IPv4 space, I would propose you be
ineligible to apply to ARIN for space from within 240/4; you already have a
functioning business with some amount of IPv4 space, and can look at either
trying to be more efficient with what you have (more CG-NAT, renumber off
public space for internal links, etc.), or participating in the open market
for IPv4 space transfers.

240/4 can be made to last a very long time, if we apply post-exhaustion
rules, rather than allowing pre-exhaustion demand curves to continue
forward.


>> I share Dave's views, I would like to see 240/4 reclassified as unicast
>> space and 2 x /8s delegated to each RIR with the /8s for AFRINIC to be held
>> until their issues have been resolved.
>>
>
> This has been discussed at great length at IETF. The consensus on the
> question has been consistent for many years now; doing work to free up
> 12-ish months of space doesn't make much sense when IPv6 exists, along with
> plenty of transition/translation mechanisms. Unless someone is able to
> present new arguments that change the current consensus, it's not going to
> happen.
>

The key difference is that IPv6-only doesn't (currently) work,
transition/translation mechanisms require an entity to have at least *some*
IPv4 addresses to anchor their transition/translation mechanisms to, and
we've created a situation that presents significant barriers to entry for
new applicants that existing entities don't face.  At some point in the
near future, I suspect governments will begin to look at the current ISP
environment as anti-competitive if we don't adjust our stance to ensure a
fair and level playing field for new entrants as well as existing incumbent
providers.  I think we're going to need to ensure that new applicants are
able to get their initial allocation of space for the foreseeable future in
order to fend off increasing regulatory pressure.  Adding space from 240/4
to the initial-allocations-only pool would help ensure that.



>
> On Thu, Jan 11, 2024 at 5:54 AM Christopher Hawker 
> wrote:
>
>> There really is no reason for 240/4 to remain "reserved". I share Dave's
>> views, I would like to see 240/4 reclassified as unicast space and 2 x /8s
>> delegated to each RIR with the /8s for AFRINIC to be held until their
>> issues have been resolved.
>>
>> Reclassifying this space would add 10+ years onto the free pool for each
>> RIR. Looking at the APNIC free pool, I would estimate there is about 1/6th
>> of a /8 pool available for delegation, another 1/6th reserved.
>> Reclassification would see available pool volumes return to pre-2010 levels.
>>
>> https://www.apnic.net/manage-ip/ipv4-exhaustion/
>>
>> In the IETF draft that was co-authored by Dave as part of the IPv4
>> Unicast Extensions Project, a very strong case was presented to convert
>> this space.
>>
>> https://www.ietf.org/archive/id/draft-schoen-intarea-unicast-240-00.html
>>
>> Regards,
>> Christopher Hawker
>>
>

Thanks!

Matt


Re: transit and peering costs projections

2023-10-15 Thread Matthew Petach
On Sun, Oct 15, 2023 at 9:47 AM Dave Taht  wrote:

> [...]
> The three forms of traffic I care most about are voip, gaming, and
> videoconferencing, which are rewarding to have at lower latencies.
> When I was a kid, we had switched phone networks, and while the sound
> quality was poorer than today, the voice latency cross-town was just
> like "being there". Nowadays we see 500+ms latencies for this kind of
> traffic.
>

When you were a kid, the cost of voice calls across town was completely
dwarfed by the cost of long-distance calls, which were insane by today's
standards.  But let's take the $10/month local-only dialtone fee from 1980;
a typical household would spend less than 600 minutes a month on local
calls,
for a per-minute cost for local calls of about 1.6 cents/minute.
(data from https://babel.hathitrust.org/cgi/pt?id=umn.319510029171372=75
)

Each call would use up a single trunk line--today, we would think of that
as an
ISDN BRI at 64Kbits.  Doing the math, that meant on average you were using
64Kbit/sec*600minutes*60sec/min or 2304000Kbit per month (2.3 Gbit/month).

A 1Mbit/sec circuit, running constantly, has a capacity to transfer
2592Gbit/month.
So, a typical household used about 1/1000th of a 1Mbit/sec circuit, on
average,
but paid about $10/month for that.  That works out to a comparative cost of
$10,000/Mbit/month in revenue from those local voice calls.
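
For anyone who wants to check my arithmetic, here it is spelled out, using
only the 1980-era figures from above:

    monthly_fee = 10.0        # dollars for local-only dialtone, 1980
    minutes = 600             # local minutes per household per month
    used_gbit = 64 * minutes * 60 / 1e6       # one 64Kbit DS0: ~2.3 Gbit/month
    seconds_per_month = 86_400 * 30
    capacity_gbit = seconds_per_month / 1e3   # 1 Mbit/sec circuit: 2,592 Gbit/month
    share = used_gbit / capacity_gbit         # ~1/1000th of the circuit
    print(round(monthly_fee / share))         # ~11250; call it $10,000/Mbit/month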

You can afford to put in a *LOT* of "just like being there" infrastructure
when you're charging your customers the equivalent of $10,000/month per Mbit to
talk across town.  Remember, this isn't adding in any long-distance charges,
this is *just* for you to ring up Aunt Maude on the other side of town to
ask when
the bake sale starts on Saturday.  So, that revenue is going into covering
the costs of backhaul to the local IXP, and to your ports on the local IXP,
to put it into modern terms.


> As to how to make calls across town work that well again, cost-wise, I
> do not know, but the volume of traffic that would be better served by
> these interconnects is quite low, relative to the overall gains in
> lower latency experiences for them.
>

If you can figure out how to charge your customers equivalent pricing
again today, you'll have no trouble getting those calls across town to
work that well again.
Unfortunately, the consumers have gotten used to much lower
prices, and it's really, really hard to stuff the cat back into the
genie bottle again, to bludgeon a dead metaphor.
Not to mention customers have gotten much more used to the
smaller world we live in today, where everything IP is considered "local",
and you won't find many willing customers to pay a higher price for
communicating with far-away websites.  Good luck getting customers
to sign up for split contracts, with one price for talking to the local IXP
in town, and a different, more expensive price to send traffic outside
the city to some far-away place like Prineville, OR!  ;)

I think we often forget just how much of a massive inversion the
communications industry has undergone; back in the 80s, when
I started working in networking, everything was DS0 voice channels,
and data was just a strange side business that nobody in the telcos
really understood or wanted to sell to.  At the time, the volume of money
being raked in from those DS0/VGE channels was mammoth compared
to the data networking side; we weren't even a rounding error.  But as the
roles reversed and the pyramid inverted, the data networking costs didn't
rise to meet the voice costs (no matter how hard the telcos tried to push
VGE-mileage-based pricing models!
-- see https://transition.fcc.gov/form477/FVS/definitions_fvs.pdf)
Instead, once VoIP became possible, the high-revenue voice circuits
got pillaged, with more and more of the traffic being pulled off over to
the cheaper data side, until even internally the telcos saw the writing
on the wall, and started to move their trunked voice traffic over to IP
as well.
But as we moved away from the SS7-based signalling, with explicit
information about the locality of the destination exchange giving way
to more generic IP datagrams, the distinction of "local" versus
"long-distance"
became less meaningful, outside the regulatory tariff domain.
When everything is IP datagrams, making a call from you to a person on
the other side of town may just as easily be exchanged at an exchange point
1,000 miles away as it would be locally in town, depending upon where your
carrier and your friend's carrier happen to be network-coincident.  So, for
the consumer, the prices go drastically down, but in return, we accept
potentially higher latencies to exchange traffic that in earlier days would
have been kept strictly local.

Long-winded way of saying "yes, you can go back to how it was when
you were a kid--but can you get all your customers to agree to go back
to those pricing models as well?"   ^_^;

Thanks!

Matt



Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-14 Thread Matthew Petach
On Sat, Oct 14, 2023 at 2:37 PM John Kristoff  wrote:

> On Sat, 14 Oct 2023 13:59:11 -0700
> Matthew Petach  wrote:
>
> > That last report shows that only half of the top 1000 websites on the
> > Alexa ranking support IPv6.
>
> The Alexa ranking is no longer maintained.  ISOC had a recent article
> talking about just this:
>
>   <
> https://pulse.internetsociety.org/blog/do-half-of-the-most-popular-websites-use-ipv6
> >
>
> John
>

Good to know, John, though it doesn't change the underlying issue; as the
Oct 12th Pulse report from your link says,
"If we look at Figure 2, we can see nearly half of the top 1,000 websites
are IPv6 capable."

which basically says things haven't moved much since the last
Alexa-rank-based measurements were taken.

Thank you for the pointer to more-up-to-date data!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-14 Thread Matthew Petach
On Wed, Oct 11, 2023 at 1:53 PM Mark Andrews  wrote:

> > On 12 Oct 2023, at 06:51, Delong.com  wrote:
>
> > The point here is that at some point, even with translation, we run out
> > of IPv4 addresses to use for this purpose. What then?
>
> You deliver the Internet over IPv6.  A really large functional Internet
> exists today if you only have IPv6.  It is only getting bigger.  Lots of
> (the majority?) of CDNs deliver content over IPv6.  Lots of companies
> outsource their SMTP to dual stacked service providers so that email still
> gets through.
> After 20 years there is no excuse for ISPs failing to deliver IPv6.  If
> you have to you, outsource your NAT64, DS-Lite transition service to
> someone that has IPv4.  I’m surprised that it isn’t common today.


While you claim "there is no excuse for ISPs failing to deliver IPv6", the
reality is that many ISPs don't support fully functional IPv6 deployments.
There are far too many networks that allocate a single /64 for a wireless
customer, and ignore DHCPv6-PD requests.  There's a reason that IPv6-relay
functionality in OpenWRT is so widespread--because even when ISPs "support"
IPv6, they often do so poorly, leading to awkward hacks like relaying the
same /64 downstream through intervening routers.
I'm all for the eventual success of IPv6, but at the moment, we're really
not there.

But the bigger point is that there's still big chunks of the content side
that aren't reachable via IPv6.
https://www.6connect.com/blog/ipv6-progress-report-top-sites-2019/
http://www.delong.com/ipv6_alexa500.html
https://whynoipv6.com/

That last report shows that only half of the top 1000 websites on the Alexa
ranking support IPv6.  So we're a long way away from being able to simply
say "You deliver the Internet over IPv6."

Your last two sentences are exactly what I stated as a business proposition
earlier.  You said (fixing the typos):
"If you have to, you outsource your NAT64, DS-Lite transition service to
someone that has IPv4.  I’m surprised that it isn’t common today."

In a world where only half the content sites are reachable via IPv6, and
IPv4 address space is exhausted, that requirement to outsource NAT64
functionality is becoming more and more a business reality going forward.

Can you list any company today that provides an outsourced NAT64
translation service?
The only one I'm aware of is the one Kasper Dupont is running, and he's got
a very clear warning that it's not suitable for high-volume use.

I can't help but see an up-and-coming demand for services to fill this
need, as it's clear Kasper's setup isn't going to handle the load for all
the IPv6-only networks that decide there's actually content they want to
get to in the other half of the Alexa top 1000 sites that don't support
IPv6.  I'm enjoying being retired, but seeing a future demand with nobody
stepping up to fulfill that demand is almost enticing enough to be worth
un-retiring for to build out...

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-10 Thread Matthew Petach
On Tue, Oct 10, 2023 at 12:58 PM Delong.com via NANOG 
wrote:

> Isn’t this supposed to be one of the few ACTUAL benefits of RPKI — You can
> specify the maximum prefix length allowed to be advertised within a shorter
> prefix and thus (theoretically) block hijackers taking advantage of
> advertising more specifics to cut you off?
>
> While I recognize that RPKI is not ubiquitous, enough of the major
> backbones are dropping RPKI invalids that I think any sort of hijacking in
> violation of that wouldn’t be very effective today.
>
> YMMV of course, but that seems to me to be a far better solution (almost
> enough to make me rethink the questionable value of RPKI) than
> disaggregation.
>
> Owen
>

Owen,

RPKI only addresses accidental hijackings.
It does not help prevent intentional hijackings.

RPKI only asserts that a specific ASN must originate a prefix.  It does
nothing to validate the authenticity of the origination.

If I am AS XX, and want to hijack a prefix from AS YY that has RPKI ROAs
protecting it, and AS YY has allowed more specifics to be announced within
the prefix range covered by the ROA, I'm in like Flynn, because I just need
to configure my router with AS YY as the origin AS, then insert the
expected ASN for the neighbor adjacency with my upstreams, and Bob's your
uncle: the more specific prefix passes RPKI validation, and traffic comes
flying my way.

If AS YY doesn't allow longer prefixes within the scope of their ROA, then
it's a bit dicier, because it comes down to AS-PATH length, but there's
still a good chance you can suck in traffic from your adjacent neighbors.

So yes, hijackings in violation of RPKI aren't as effective, but RPKI
doesn't prevent intentional hijackings--it just protects against accidental
misconfigurations and unintentional hijackings.

Thus, deaggregation is still very much part of the defensive toolbox, even
with RPKI in place.

As a side note, it's also a really good reason why you shouldn't allow
longer prefixes to be announced under your ROAs, except under very well
understood conditions.   ^_^;
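
To make that concrete, here's a stripped-down sketch of RFC 6811-style
route origin validation, showing why the forged-origin more-specific
passes when the ROA's maxLength is permissive, and fails when the
maxLength is pinned to the announced prefix length. The prefix and ASNs
are made up for illustration:

    from ipaddress import ip_network

    def rov(route, origin_as, roas):
        # Return 'valid', 'invalid', or 'not-found' per RFC 6811's logic.
        net = ip_network(route)
        covered = False
        for roa_prefix, roa_maxlen, roa_asn in roas:
            if net.subnet_of(ip_network(roa_prefix)):
                covered = True
                if origin_as == roa_asn and net.prefixlen <= roa_maxlen:
                    return "valid"
        return "invalid" if covered else "not-found"

    # AS 65001 signs 192.0.2.0/22 with a permissive maxLength of /24.
    roas = [("192.0.2.0/22", 24, 65001)]
    # A hijacker announces a /24 more-specific, simply *claiming* origin 65001:
    print(rov("192.0.2.0/24", 65001, roas))  # 'valid' -- ROV can't see the forgery
    # Pin maxLength to the announced /22, and the same more-specific fails:
    print(rov("192.0.2.0/24", 65001, [("192.0.2.0/22", 22, 65001)]))  # 'invalid'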

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-09 Thread Matthew Petach
On Mon, Oct 9, 2023 at 11:38 AM Delong.com via NANOG 
wrote:

> [...]
>
> My grimmer picture for IPv4 is about the intrinsic pressure to deaggregate
> that comes from the ever finer splitting of blocks in the transfer market
> and the ever finer grained dense packing of hosts into prefixes that is
> forced from address scarcity. Those pressures don’t (or at least shouldn’t)
> exist for IPv6.
>

Well, it's also time to recognize and talk about the elephant in the room.

We know we can have an IPv4-only internet, we've been doing it for decades.

Our experiments thus far at an IPv6-only Internet have largely been (well,
honestly, *compeletely*) unsuccessful.  In order to exist on the Internet
today, you *must* have some IPv4 presence.  The reverse is not true; you
can exist on the Internet with no IPv6 resources.

As a result, as you noted, the pressure to split IPv4 ever-smaller so that
everyone gets a tiny piece of that essential pie is nearly infinitely
greater than it is for IPv6.

As a community, we have failed, because we never acknowledged and addressed
the need for backward compatibility between IPv6 and IPv4, and instead
counted on magic handwaving about tipping points and transition dates where
suddenly there would be "enough" IPv6-connected resources that new networks
wouldn't *need* IPv4 address space any more.

In doing so, we have sown the seeds of our own future pain and suffering.
By allowing IPv6 to be defined and established as an incompatible network
protocol to IPv4, we ensured that IPv4's future was assured.
*Every* transition mechanism we have for networks today relies on having
*some* amount of IPv4 address space for the translation gateway devices,
which will continue to drive an ever-increasing demand for smaller and
smaller chunks of IPv4 address space to be parceled out to every new
network that wants to join the Internet.

The only alternative is that web-scale companies like Amazon and Google
stand up swaths of IPv6-to-IPv4 translation gateway boxes, and provide
6-to-4 bidirectional translation services, with some clever marketing
person figuring out how to make money reliably from the service.

At that point, new entrants could conceivably get on board the Internet
with only IPv6 resources, with no need to scrabble for a chunk of
ever-decreasing IPv4 space to perform the necessary gateway translation for
their customers.

Unfortunately, because it's not just a mapping problem but an actual
packet-level incompatibility, the companies providing the magical
bidirectional translation service are going to be in the pathway for the
entire bitstream, making it a bandwidth-intensive product to deploy.  :(

On the plus side, they'd have the best view into everyone's traffic one
could ever hope for.  Forget just seeing DNS queries--you'd have visibility
into *everything* the users were doing, no matter how tiny and mundane it
might be.  Imagine the data mining potential!!

If I were younger, stupider, and much, much, MUCH richer, I might start a
company to do just that...

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-07 Thread Matthew Petach
On Sat, Oct 7, 2023 at 9:27 AM Willy Manga  wrote:

> Hi.
>
> On 06/10/2023 16:00, nanog-requ...@nanog.org wrote:
> > From: Matthew Petach
> [...]
> >
> > There's significantly less pressure to deaggregate IPv6 space right now,
> > because we don't see many attacks on IPv6 number resources.
> > Once we start to see v6 prefix hijackings, /48s being announced over /32
> > prefixes to pull traffic, then I think we'll see IPv6 deaggregation
> > completely swamp IPv4 deaggregation.
>
> How about we educate each other to not assume you must deaggregate your
> prefix especially with IPv6?
>

If you're the victim of a prefix hijacking, you don't really have a choice.
Right now, the only way to counteract a prefix hijacking is to advertise
something at least as specific as the prefix being hijacked, or more
specific if possible.

> I see 'some' (it's highly relative) networks on IPv4; they 'believe'
> they have to advertise every single /24 they have. And when they start
> with IPv6, they replicate the same mindset with a ton of /48s. You can
> imagine what will happen of course.
>
> A better alternative IMHO is to take advantage to the large prefix range
> and advertise a sub-aggregate when necessary. But absolutely not each
> end-node or customer prefix.
>

Absolutely.
Right up until the moment someone hijacks part of your IP space.
And then you announce a bunch of more specifics to try to counteract the
hijacking.
If you're a good, responsible network, you remove the more specific
prefixes once the hijacking is done.
If you're most networks, you're overworked, understaffed, and cleanup is at
the bottom of the priority list, so you just leave them being announced,
just in case someone tries to hijack your space again.

Most cases of deaggregation I've seen are the result of an event that took
place that triggered it, not just because people don't know better.

Now, RPKI can help a little bit, at least with protecting you from
accidental route leaks and unintended hijacks; but it only validates the
ASN originating the prefix, it doesn't validate the full pathway.  So,
being a determined hijacker, I'm going to set my router up to pretend to be
the correct origin ASN, and announce more specifics, adjusting the AS-PATH
to match what my neighbors and upstreams expect to see, and utter silent
thanks that most networks use a relatively liberal "max length" for the
prefixes in their ROAs (just in case *they* need to announce more specifics
to counteract my hijacking effort).

As we crack the BGP path validation nut, and put some means in place to
validate BGP adjacencies, this attack vector will fade away, and the need
to be able to announce more specifics willy-nilly will slowly go by the
wayside.  But for the moment, it's just as necessary in IPv6 as it is in
IPv4, though the resulting impact is less, because wise networks allocate
their IPv6 prefixes in a sparse manner, meaning that during a hijack event,
you only need to announce the matching /48s for the blocks carrying
relevant traffic, which should be a small fraction of your overall v6
assignment.

I completely agree that we should educate network engineers to only
advertise the largest prefix possible that covers your space.
But I also realize that in the world of non-secured BGP adjacencies and
non-validatable BGP AS-PATHs, we cannot fault people for having to
deaggregate during prefix hijacking events.

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-05 Thread Matthew Petach
On Wed, Oct 4, 2023 at 11:33 PM Mark Tinka  wrote:

>
>
> On 10/5/23 08:24, Geoff Huston wrote:
>
> The IPv6 FIB is under the same pressure from more specifics. It's taken 20
> years to get there, but the IPv6 FIB is now looking stable at 60% of the
> total FIB size [2]. For me, that's a very surprising outcome in an
> essentially unmanaged system.
>
>
> Were you expecting it to be lower than IPv4?
>
> Mark.
>

I've dug through the mailman mirror on nanog.org, and there's currently no
post by Geoff Huston saying that:

https://community.nanog.org/search?q=geoff%20huston%20order%3Alatest

But I'll play along.

There's significantly less pressure to deaggregate IPv6 space right now,
because we don't see many attacks on IPv6 number resources.
Once we start to see v6 prefix hijackings, /48s being announced over /32
prefixes to pull traffic, then I think we'll see IPv6 deaggregation
completely swamp IPv4 deaggregation.
Either that, or content sites will simply turn off IPv6 AAAA records during
periods of attack, and let the traffic shift back to IPv4 instead.

When your IPv4 space gets hijacked, there's no fallback; you announce /24s,
because that's all you *can* do.
When your IPv6 space gets hijacked, there's always IPv4 as the fallback, so
there's less pressure to announce /48s for all your space, just in case
someone tries to hijack it.
Otherwise, we would already be seeing the IPv6 deaggregation completely
overwhelming the IPv4 deaggregation.

Thanks!

Matt


Re: U.S. test of national alerts on Oct. 4 at 2:20pm EDT (1820 UTC)

2023-10-04 Thread Matthew Petach
On Wed, Oct 4, 2023 at 12:37 PM Sean Donelan  wrote:

> On Wed, 4 Oct 2023, Matthew Petach wrote:
> > Well, today's alert still showed up as "Presidential Alert", so I guess the
> > US hasn't quite finished changing over yet.  ^_^;
> > (Samsung Galaxy phone)
>
> Yeah, Samsung is bad about releasing software updates for its older (a few
> months old) products.
>
> Think about out-of-date security patches :-) if Samsung doesn't update a
> text field.


Ah, I didn't realize that was locally set on the device--I thought that was
part of the message header in the message being sent out.

Thanks for the clarification.  ^_^

Matt


Re: U.S. test of national alerts on Oct. 4 at 2:20pm EDT (1820 UTC)

2023-10-04 Thread Matthew Petach
On Wed, Oct 4, 2023 at 12:25 PM Sean Donelan  wrote:

>
> Emergency alerts are built into all android, ios and other mobile phones
> sold in almost every country during the last 5 years.  GSM standards are
> global.  The U.S. finally changed "presidential alert" to "national alert"
> recently.


Well, today's alert still showed up as "Presidential Alert", so I guess the
US hasn't quite finished changing over yet.  ^_^;
(Samsung Galaxy phone)

Matt


Re: cogent spamming directly from ARIN records?

2023-10-04 Thread Matthew Petach
On Mon, Oct 2, 2023 at 7:27 PM Collider 
wrote:

> Congrats! LIOAWKI is a hapax legomenon in DuckDuckGo's search results!
> Could you please tell me & the list what it means?
>

Large Internet Outages Are What Kills Income!

It's a phrase that is uttered by members of the finance organization every
time they see Network Engineers planning a "routine maintenance on the core
backbone routers".

Matt


Re: cogent spamming directly from ARIN records?

2023-10-02 Thread Matthew Petach
On Mon, Oct 2, 2023, 12:14 Mark Tinka  wrote:

>
>
> On 10/2/23 20:58, Tim Burke wrote:
>
> > Hurricane has been doing the same thing lately... but their schtick is
> to say that "we are seeing a significant amount of hops in your AS path and
> wanted to know if you are open to resolve this issue".
>
> I get what HE are trying to do here, as I am sure all of us do.
>
> The potential fallout is a declining relationship with their existing
> customers that bring other downstream ISP's behind them. Contacting
> those downstream ISP's to "resolve this issue" puts them at odds with
> their existing customers who bring those customers in already.
>
> There is a chance they dilute their income because, well, smaller ISP's
> will not be keen to pay the higher transit fees their upstreams pay to
> HE. Which means that HE are more willing to be closer to eyeballs than
> they are maximizing margins.
>

Huh?

In all my decades of time in the network industry, I have never seen a case
where a smaller transit contract had a lower per-Mbit cost than a larger
volume contract.

I would expect that HE would make *more* money off 10 smaller customer
transit contracts than one big tier 3 wholesaler transit contract.

It seems like a win-win for HE:
more customer revenue *and* shorter hop-count paths they can advertise to
the rest of the world.

> Is the loss of customer trust worth the transit-free glory?
>

When it's offset by more revenue?

Sure seems like it.   ;)


> Mark.
>

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-02 Thread Matthew Petach
On Mon, Oct 2, 2023 at 11:46 AM Tim Franklin  wrote:

> On 02/10/2023 19:24, Matthew Petach wrote:
>
> The problem with this approach is you now have non-deterministic routing.
>
> Depending on the state of FIB compression, packets *may* flow out
> interfaces that are not what the RIB thinks they will be.
> This can be a good recipe for routing micro-loops that come and go as your
> FIB compression size ebbs and flows.
>
> Had NOT considered the looping - that's what you get for writing in public
> without thinking it all the way through *blush*.
>
> Thanks for poking holes appropriately,
> Tim.
>

No worries--if this were easy, we would have been doing it decades ago
without thinking twice.

To William's point to Tom--we are perhaps using the term "compression" in
incompatible ways at times during this conversation.

There is a difference between what the papers William cited are doing,
which is finding more optimal ways of storing the full structure in memory
with what I think the general thread here is talking about, which is
'proxy-aggregation' of a form--reducing the actual number of entries in the
forwarding table, regardless of the *method* of storage.

"FIB compression" of the form talked about in the papers William cited is
already being done; we don't store whole string representations of the
routing table in memory, and look them up sequentially, we store them in
binary tries, which are faster and take up less space (i.e., compressed), but
they still encode and represent the *whole* set of prefixes in the
forwarding table.

"FIB-count-reduction" would be a more accurate term for what we're tossing
about here, and that's where dragons lie, because that's where your FIB and
RIB no longer represent the same set of information.  And while Jon is
right, it can help struggling ISPs stave off expensive upgrades, it does so
at the cost of potentially increased troubleshooting nightmares when
packets stop going where the RIB expects them to go, and network engineers
are left scratching their heads trying to figure out why.   ^_^;

As Mark just said--sane ISPs push their vendor for a knob to disable it, so
that they can return to the land of deterministic lookups for the sanity
of their engineers.  ;)
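
If it helps, here's a toy illustration of the count-reduction flavor--not
ORTC or any real vendor algorithm, just a single pass that merges sibling
prefixes sharing a next hop into their covering parent, over made-up
prefixes:

    from ipaddress import ip_network

    def merge_siblings(fib):
        # fib: dict of prefix string -> next hop. One merge pass.
        out = dict(fib)
        for prefix, nh in list(fib.items()):
            parent = ip_network(prefix).supernet()
            siblings = list(parent.subnets())  # the two halves of the parent
            if all(str(s) in out and out[str(s)] == nh for s in siblings):
                for s in siblings:
                    out.pop(str(s), None)
                out[str(parent)] = nh
        return out

    fib = {"10.0.0.0/25": "if1", "10.0.0.128/25": "if1", "10.0.1.0/24": "if2"}
    print(merge_siblings(fib))
    # {'10.0.1.0/24': 'if2', '10.0.0.0/24': 'if1'} -- three entries become two

The RIB still holds three routes; the forwarding table now holds two, and
that divergence is exactly where the trouble starts.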

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-02 Thread Matthew Petach
On Mon, Oct 2, 2023 at 6:21 AM t...@pelican.org  wrote:

> On Monday, 2 October, 2023 09:39, "William Herrin"  said:
>
> > That depends. When the FIB gets too big, routers don't immediately
> > die. Instead, their performance degrades. Just like what happens with
> > oversubscription elsewhere in the system.
> >
> > With a TCAM-based router, the least specific routes get pushed off the
> > TCAM (out of the fast path) up to the main CPU. As a result, the PPS
> > (packets per second) degrades really fast.
> >
> > With a DRAM+SRAM cache system, the least used routes fall out of the
> > cache. They haven't actually been pushed out of the fast path, but the
> > fast path gets a little bit slower. The PPS degrades, but not as
> > sharply as with a TCAM-based router.
>
> Spit-balling here, is there a possible design for not-Tier-1 providers
> where routing optimality (which is probably not a word) degrades rather
> than packet-shifting performance?
>
> If the FIB is full, can we start making controlled and/or smart decisions
> about what to install, rather than either of the simple overflow conditions?
>
> For starters, as long as you have *somewhere* you can point a default at
> in the worst case, even if it's far from the *best* route, you make damn
> sure you always install a default.
>
> Then you could have knobs for what other routes you discard when you run
> out of space.  Receiving a covering /16?  Maybe you can drop the /24s, even
> if they have a different next hop - routing will be sub-optimal, but it
> will work.   (I know, previous discussions around traffic engineering and
> whether the originating network must / does do that in practice...)
>

The problem with this approach is you now have non-deterministic routing.

Depending on the state of FIB compression, packets *may* flow out
interfaces that are not what the RIB thinks they will be.
This can be a good recipe for routing micro-loops that come and go as your
FIB compression size ebbs and flows.

Taking your example: RTR-A -- RTR-B -- RTR-C
RTR-A is announcing a /16 to RTR-B
RTR-C is announcing a /24 from within the /16 to RTR-B, which is passing it
along to RTR-A

If RTR-B's FIB compression fills up, and falls back to "drop the /24, since
I see a /16", packets destined to the /24 arriving from RTR-A will reach
RTR-B,
which will check its FIB, and send them back towards RTR-A... which will
send them back to RTR-B, until TTL is exceeded.
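
You can watch the ping-pong happen in a toy longest-prefix-match
simulation of that exact scenario (addresses made up for illustration):

    from ipaddress import ip_address, ip_network

    def lpm(fib, dst):
        # Longest-prefix-match lookup; fib is a list of (prefix, next hop).
        best = max((net for net, _ in fib if ip_address(dst) in net),
                   key=lambda net: net.prefixlen, default=None)
        return dict(fib)[best] if best else None

    fib_a = [(ip_network("10.0.0.0/16"), "local"),  # RTR-A originates the /16
             (ip_network("10.0.5.0/24"), "RTR-B")]  # /24 learned via RTR-B
    fib_b = [(ip_network("10.0.0.0/16"), "RTR-A")]  # /24 compressed out of B's FIB

    node, dst, ttl = "RTR-A", "10.0.5.1", 5
    while ttl:
        nh = lpm({"RTR-A": fib_a, "RTR-B": fib_b}[node], dst)
        print(node, "->", nh)
        node, ttl = nh, ttl - 1
    # Prints RTR-A -> RTR-B, RTR-B -> RTR-A, ... until the TTL runs out.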

BTW, this scenario holds true even when it's a default route coming from
RTR-A, so saying "well, OK, but we can do FIB compression easily as long as
we have a default route to fall back on" still leads to packet-ping-ponging
on your upstream interface towards your default if you ever drop a more
specific from your FIB that is destined downstream of you.

You're better off doing the filtering at the RIB end of things, so that
RTR-B no longer passes the /24 to RTR-A; sure, routing breaks at that
point, but at least you haven't filled up the RTR-A to RTR-B link with
packets ping-ponging back and forth.

Your routing protocols *depend* on packets being forwarded along the
interfaces the RIB thinks they'll be going out in order for loop-free
routing to occur.
If the FIB decisions are made independently of the RIB state, your routing
protocols might as well just give up and go home, because no matter how
many times they run Dijkstra, the path to the destination isn't going to
match where the packets ultimately end up going.

You could of course fix this issue by propagating the decisions made by the
FIB compression algorithm back up into the RIB; at least then, the network
engineer being paged at 3am to figure out why a link is full will instead
be paged to figure out why routes aren't showing up in the routing table
that policy *says* should be showing up.

Understand which routes your customers care about / where most of your
> traffic goes?  Set the "FIB-preference" on those routes as you receive
> them, to give them the greatest chance of getting installed.
>
> Not a hardware designer, I have little idea as to how feasible this is - I
> suspect it depends on the rate of churn, complexity of FIB updates, etc.
> But it feels like there could be a way to build something other than
> "shortest -> punt to CPU" or "LRU -> punt to CPU".
>
> Or is everyone who could make use of this already doing the same filtering
> at the RIB level, and not trying to fit a quart RIB into a pint FIB in the
> first place?
>

The sane ones who care about the sanity of their network engineers
certainly do.   ^_^;



> Thanks,
> Tim.
>


Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-01 Thread Matthew Petach
On Sun, Oct 1, 2023 at 11:25 AM Seth David Schoen 
wrote:

> Matthew Petach writes:
>
> > I would go a step further; for any system of compression hoping to gain a
> > net positive space savings,
> > Godel's incompleteness theorem guarantees that there is at least one
> input
> > to the system that will result in no space savings whatsoever.
>
> This is rather the Pigeonhole Principle that guarantees this.
>
> https://en.wikipedia.org/wiki/Lossless_compression#Limitations


Ah, thank you for the more specific pointer--a good read, though slightly
less entertaining than Hofstadter.  ^_^

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-01 Thread Matthew Petach
On Sun, Oct 1, 2023 at 1:03 AM Saku Ytti  wrote:

> On Sun, 1 Oct 2023 at 06:07, Owen DeLong via NANOG 
> wrote:
>
> > Not sure why you think FIB compression is a risk or will be a mess. It’s
> a pretty straightforward task.
>
> Also people falsely assume that the parts they don't know about, are
> risk free and simple.
>
> While in reality there are tons of proprietary engineering choices to
> make devices perform in expected environments, not arbitrary
> environments. So already today you could in many cases construct
> specific FIB, which exposes these compromises and makes devices not
> perform.


I would go a step further; for any system of compression hoping to gain a
net positive space savings,
Godel's incompleteness theorem guarantees that there is at least one input
to the system that will result in no space savings whatsoever.

If your device is counting on FIB compression to deliver sufficient space
savings to allow a FIB of size > SRAM to fit into SRAM,
it really should have a reasonable, sane fallback mode for when the next
routing update happens to result in a FIB that is incompressible.

Unfortunately, many coders today have not read Godel, Escher, Bach: An
Eternal Golden Braid,
and like the unfortunate Crab, consider their FIB compression algorithms to
be unbreakable[0].

As I discovered many years ago, at web scale, even seemingly
highly-improbable sequences of bits end up happening frequently enough to
become problematic.

In short: if you count on FIB compression working at a compression ratio
greater than 1 in order for your network to function, you had better have a
good plan for what to do when your phone rings at 3am because your FIB has
just become incompressible.   ^_^;
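
The underlying guarantee is really just a counting argument--there are
more n-bit inputs than there are strictly shorter outputs, so some input
can't shrink:

    # Why no lossless compressor shrinks every input: count the pigeonholes.
    def inputs_vs_shorter_outputs(n):
        n_inputs = 2 ** n                          # distinct n-bit strings
        n_shorter = sum(2 ** k for k in range(n))  # all strings shorter than n bits
        return n_inputs, n_shorter                 # n_shorter = 2**n - 1 < n_inputs

    for n in (8, 16, 24):
        i, o = inputs_vs_shorter_outputs(n)
        print(f"n={n}: {i} inputs, only {o} shorter outputs")

Map that onto forwarding tables, and for any compression scheme there is
some table that gains nothing from it--which is exactly that 3am phone
call.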

Matt

[0]. https://genius.com/Douglas-hofstadter-contracrostipunctus-annotated


Re: maximum ipv4 bgp prefix length of /24 ?

2023-09-29 Thread Matthew Petach
On Fri, Sep 29, 2023 at 10:43 AM VOLKAN SALİH 
wrote:

> thanks for your response. Honestly, thanks for everyone's responses.
>
> communism is the future, IMO.
>
> tier-1 network count is decreasing. competition is always good. while
> monopoly, duopoly, triopoly is not.. I dream an earth with 1000 tier-1
> networks..
>
> capitalism gives people more money than they can spend in their lifetime
> with their families, but it doesn't give people happiness and health..
>
> for example, if i were level3 or telia CEO or should I have been major
> stakeholder? I would like to see 50 or 1000 more tier-1 networks competing
> with us.
>
> Money is not everything. After some time capitalist bourgeois realize that
> they could not "earn" health or happiness and start spending their pennies
> to charities,
>

Volkan,

You make a good point here.   I asked you to come up with a win-win
scenario that would persuade the financial people at the top 100 networks
why they should spend money to upgrade their networks for you.

I hadn't considered the charity option.

If you can organize your work as a non-profit charity, you may find there
are entities that will sponsor the number resource needs of your non-profit
charity.
That would be one way to achieve what you're looking for without having to
make a revenue-based pitch to 100 different CFOs.


> because if we wouldn't believe heaven and hell and purgatory, what else we
> could believe? Should we believe that after death nothing is left from the
> earth, we worked for nothing, we laughed for nothing, we cried for nothing,
> and we married for nothing?
>
> NOPE.
>
> Everyone is equal, in the god's/lord's/creator's vision. You need to work
> on communism instead of capitalism..
>
> I do not care what CFO/CTO/CEO/CXO thinks! they are more miserable than
> me..! I am healthy and happy. They are not. They can not be. I just
> expressed my opinions, finalized them with a bad joke. ;D
>
> You can continue your feasibility reports, net profit margin, return on
> investment calculations, but god doesn't care IMO, and you will not care
> after you are 70-80 years old.
>

I think the CFO would be quite happy if god were to show up and explain the
cost-benefit analysis of performing the network upgrades you are asking for.

...not so much because of what it would mean for the company's quarterly
numbers, but what it would mean for them on the daytime television circuit.
I mean, you can't *buy* that level of publicity!

"Today on The Talk--the CFO that met god...and lived to tell about it!"

Heck, after a meeting like that, I'd just retire from finance, and rake in
the money from the talk show appearances.  ;)


On a slightly more serious note--communism doesn't work when it comes to
network upgrades.  *someone* has to buy the router upgrades, and Juniper
doesn't accept "the will of the people" as a valid form of payment.

Like it or not, money makes the (networking) world go around, and without
it, new hardware isn't just going to show up on your doorstep.


Now, if you want, we can have a more serious discussion about *why*
backbone routers cost so much, and why quadrupling the size of the
forwarding table really makes such a big impact on the cost of a router.
But to have that discussion, I'm afraid we're going to have to leave god
out of it, and instead invite Messrs. Newton, Bernoulli, and Euler to the
table to go into some serious math about air flow, heat transfer, and
thermodynamics.   ^_^;

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-09-29 Thread Matthew Petach
On Fri, Sep 29, 2023 at 10:42 AM Valerie Wittkop  wrote:

> There is one person that reviews the moderation queue of the NANOG list.
> My morning was rather hectic, and I didn’t get to the queue until just
> before 12:30 EDT today.
>
> Apologies to all for the delay in the messages of this thread. Please note
>  I try to check the queue a few times throughout the day, and one last time
> again before I shut down for the night.
>


Ah, no worries, Valerie--I know how challenging it can be handling a task
like that.
Thank you for all your hard work on it!

I'm just curious how Owen was able to see and respond to the messages in
the thread before they were sent out to the rest of the list.  ^_^;

Perhaps he's got some super-sekret backdoor access?  Or is this a case
where interacting with Discourse gives one a leg up on the rest of the list?
(I notice I can see messages showing up in the Discourse mirror well before
they make it to my inbox).

Thank you again for the quick reply, and all the great work you do for the
community, Valerie--you totally ROCK!  :)

Thanks!

Matt


Re: maximum ipv4 bgp prefix length of /24 ?

2023-09-29 Thread Matthew Petach
On Fri, Sep 29, 2023 at 9:42 AM VOLKAN SALİH 
wrote:

> [...]
>
> I presume there would be another 50 big ASNs that belong to CDNs. And I am
> pretty sure those top 100 networks can invest in gear to support /25-/27.
>

Volkan,

So far, you haven't presented any good financial reason those top 100
networks should spend millions of dollars to upgrade their networks just so
your /27 can be multihomed.

Sure, they *can* invest in gear to support /25-/27; but they won't, because
there's no financial benefit for them to do so.

I know from *your* side of the table, it would make your life better if
everyone would accept /27 prefixes--multihoming for the masses, yay!

Try standing in their shoes for a minute, though.
You need to spend tens of millions of dollars on a multi-year refresh cycle
to upgrade hundreds of routers in your global backbone, tying up network
engineering resources on upgrades that at the end, will bring you exactly
$0 in additional revenue.

Imagine you're the COO or CTO of a Fortune 500 network, and you're meeting
with your CFO to pitch this idea.
You know your CFO is going to ask one question right off the bat "what's
the timeframe for us to recoup the cost of
this upgrade?" (hint, he's looking for a number less than 40 months).
If your answer is "well, we're never going to recoup the cost.  It won't
bring us any additional customers, it won't bring us any additional
revenue, and it won't make our existing customers any happier with us.  But
it will make it easier for some of our smaller competitors to sign up new
customers." I can pretty much guarantee your meeting with the CFO will end
right there.

If you want networks to do this, you need to figure out a way for it to
make financial sense for them to do it.

So far, you haven't presented anything that would make it a win-win
scenario for the ISPs and CDNs that would need to upgrade to support this.


On a separate note--NANOG mailing list admins, I'm noting that Volkan's
emails just arrived a few minutes ago in my gmail inbox.
However, I saw replies to his messages from others on the list yesterday,
a day before they made it to the general list.
Is there a backed up queue somewhere in the NANOG list processing that is
delaying some messages sent to the list by up to a full day?
If not, I'll just blame gmail for selectively delaying portions of NANOG
for 18+ hours.   ^_^;

Thanks!

Matt


Re: constraining RPKI Trust Anchors

2023-09-26 Thread Matthew Petach
Job,

This looks fantastic, thank you!

For my edification and clarification, the reason you don't need a

deny 2000::/3

or

deny 0::/0

at the bottom of the ARIN list of allows is that every file comes with an
implicit "deny all", is that correct?

Is there a drawback to adding the explicit "deny 0::/0" at the bottom of
the file, to make it clear that everything else will return "invalid"?
I tend to prefer being explicit in my configurations, rather than depending
upon implicit behaviours which might change with future versions of
software releases.
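
For illustration, this is the first-match-plus-implicit-deny semantics I'm
picturing--a conceptual sketch only, not rpki-client's actual constraints
syntax:

    from ipaddress import ip_network

    def permitted(resource, rules):
        # rules: ordered list of ('allow' | 'deny', prefix); first match wins.
        res = ip_network(resource)
        for action, prefix in rules:
            if res.subnet_of(ip_network(prefix)):
                return action == "allow"
        return False  # implicit deny: anything not explicitly allowed fails

    arin_v6 = [("allow", "2620::/23")]            # example ARIN-managed block
    print(permitted("2620:0:abc::/48", arin_v6))  # True
    print(permitted("2001:db8::/32", arin_v6))    # False, via the implicit deny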

Thanks!

Matt


On Tue, Sep 26, 2023 at 9:57 AM Job Snijders via NANOG 
wrote:

> Dear all,
>
> Two weeks ago AFRINIC was placed under receivership by the Supreme Court
> of Mauritius. This event prompted me to rethink the RPKI trust model and
> associated risk surface.
>
> The RPKI technology was designed to be versatile and flexible to
> accommodate a myriad of real-world deployment scenarios including
> multiple trust anchors existing, inter-registry transfers, multiple
> transports, and permissionless innovation for signed objects, for
> example. All good and well ... but of course there is a fine print. :-)
>
> Over the years various people have expressed astonishment about each RIR
> having issued so-called 'all-resources' (0.0.0.0/0 + ::/0) trust anchor
> certificates, but this aspect often is misunderstood: the risk is not
> necessarily in the listing of 'all-resources' itself, it is in the RIR
> being able to issue an 'all-resources' certificate in the first place.
> RPKI trust anchor operators indeed can voluntarily reduce the scope of
> subordinate Internet Number Resources, but just as easily increase the
> scope of their authority. In other words, a trust anchor cannot truly
> constrain itself.
>
> Upon reconsideration on how exactly RPKI hooks into the real world; I
> concluded trust anchors do not require unbounded trust in order to
> provide constructive services in the realm of BGP routing security.
>
> Some examples: ARIN does not support inter-RIR IPv6 transfers, so it
> would not make any sense to see a ROA subordinate to ARIN's trust anchor
> covering RIPE-managed IPv6 space. Conversely, it wouldn't make sense
> to observe a ROA covering ARIN-managed IPv6 space under APNIC's,
> LACNIC's, or RIPE's trust anchor - even if a cryptographically valid
> certificate path existed. Along these lines AFRINIC doesn't support
> inter-RIR transfers of any kind; and none of the RIRs have authority
> over private resources like 10.0.0.0/8 or AS 65535. It seems feasible to
> paint constraints around RPKI trust anchors in broad strokes.
>
> Over the last two weeks I've diligently worked with Theo Buehler to
> research RIR transfer policies, untangle the history of the IANA->RIR
> and RIR->RIR allocation spaghetti, design & implement a maintainable
> constraints mechanism for rpki-client(8), and publicly document the
> concept of applying operator-defined policy to derived trust arcs.
>
> Please take a moment to read
>
> https://www.ietf.org/archive/id/draft-snijders-constraining-rpki-trust-anchors-00.html
>
> Your feedback is appreciated.
>
> Kind regards,
>
> Job
>


Re: Zayo woes

2023-09-20 Thread Matthew Petach
On Tue, Sep 19, 2023 at 12:21 PM Mike Hammett  wrote:

> Well sure, and I would like to think (probably mistakenly) that just no
> one important enough (to the money people) made the money people understand
> that these other things are *REQUIRED* to make the deal work.
>
> Obviously, people lower on the ladder say it all of the time, but the
> important enough money people probably don't consider those people
> important enough to listen to.
>


Not quite.

It's more of what Mark said:

"  I blame this on the success of how well we have built the Internet with
whatever box and tool we have, as network engineers."

I have worked time and time again with absolute miracle workers in the
networking field.
They say over and over again "to make this work, we need $X M to get the
right hardware", even directly to the CFO.

They get handed a roll of duct tape, some baling wire, a used access point
and a $25 gift card from Office Depot, and they turn it into a functional
BGP-speaking backbone, because that's what they're good at.

The CFO and the rest of the executives that said "no" to the request for $X
M to make the integration work properly pat themselves on the back, saying
"see, we knew they didn't really NEED that money to make it work."

A year down the line, customers are posting to NANOG wondering why things
are going to hell in a handbasket at ISP A, as the BGP-speaking access
point with some duct tape, baling wire, and SFPs purchased from Office
Depot that ties the two networks together starts failing.

As network engineers, we collectively set ourselves up for this by being so
damn good at pulling miracles out of our backside to keep things running.
We've effectively been training our executives that if they habitually turn
down our requests for resources, we'll still find some way of making things
work.

We pride ourselves on being able to keep a dozen spinning plates going like
a circus performer, without letting any of them crash to the floor.

It's a hard thing to do, but one lesson I've taught junior network
engineers of all ages is that sometimes, you have to step back, and watch a
plate smash into the floor, *even if you could have rescued it*, if it
seems like that's the only way your executive team will understand that if
requests for necessary resources are denied, there will be operational
impacts.

Now, it's not something you should do lightly, and not something to do
without first working with the executives to understand why the resource
request is being denied.
If you are working at a startup, and the money is running out, and the
company is one step ahead of the creditors, probably not the time to put
the foot down and intentionally let things crash and burn.

But if the company is doing well, has the money, and the executives just
want the numbers to look good for wall street analysts, then it's time to
pause the miracle working, and help them understand that they cannot simply
expect you to pull a miracle out of your backside every time, just so they
can look good.

If we continue to pull off miracles after telling executives that
additional resources are required, it's no wonder they don't take the
requests as seriously as they should.  ^_^;

Matt




> --
> *From: *"Mark Tinka" 
> *To: *nanog@nanog.org
> *Sent: *Tuesday, September 19, 2023 10:28:26 AM
> *Subject: *Re: Zayo woes
>
>
>
> On 9/19/23 16:48, Mike Hammett wrote:
>
> As someone that has been planning to be in the acquiring seat for a while
> (but yet to do one), I've consistently passed to the money people that
> there's the purchase price and then there's the % on top of that for
> equipment, contractors, etc. to integrate, improve, optimize future
> cashflow, etc. those acquisitions with the rest of what we have.
>
>
> I blame this on the success of how well we have built the Internet with
> whatever box and tool we have, as network engineers.
>
> The money people assume that all routers are the same, all vendors are the
> same, all software is the same, and all features are easily deployable. And
> that all that is possible if you can simply do a better job finding the
> cheapest box compared to your competition.
>
> In general, I don't fancy nuance when designing for the majority. But with
> acquisition and integration, nuance is critical, and nuance quickly shows
> that the acquisition was either underestimated, or not worth doing at all.
>
> Mark.
>
>
>


Re: Zayo woes

2023-09-19 Thread Matthew Petach
On Tue, Sep 19, 2023 at 7:19 AM Mike Hammett  wrote:

> [...]
> I've never understood companies that acquire and don't completely
> integrate as quickly as they can.
>


Ah, spoken with the voice of someone who's never been in the position of:
a) acquiring a company not-much-smaller-than-you that
b) runs on completely different hardware and software and
c) your executives have promised there will be cost savings after the
merger due to "synergies" between the two companies.
^_^;

Let's say you're an all J shop; your scripts, your tooling, everything
expects to be talking to J devices.

Your executives buy a company that has almost the same size network--but
it's all C devices running classic IOS.

You can go to your executives and tell them "hey, to integrate quickly with
our network and tooling, we need to swap out all their C gear for J gear;
it's gonna cost an extra $50M"
The executives respond by pointing at c) above, and denying the request for
money to convert the acquired network to J.

You can go to your network and say "hey, we need to revamp our tooling and
systems to understand how to speak to C and J devices equally, in spite of
wildly different syntaxes for route-maps and the like; it's going to take 4
more developer headcount to rewrite all the systems."
The executives respond by pointing at c) above, and deny the request for
developer headcount to rewrite your software systems.

The general result of acquisitions of similar-sized companies is that the
infrastructure runs in parallel, slowly getting converted over and unified
as gear needs to be replaced, or sites are phased out--because any other
course of action costs more money than the executives had promised the
shareholders, the board, or the VCs, depending on what stage your company
is at.

Swift integrations cost money, and most acquisitions promise cost savings
instead of warning of increased costs due to integration.

That's why most companies don't integrate quickly.  :(

Matt


Re: AFRINIC placed in receivership

2023-09-16 Thread Matthew Petach
On Sat, Sep 16, 2023 at 6:24 AM Eric Kuhnke  wrote:

>
> https://www.devdiscourse.com/article/international/1813989-the-strange-case-of-africas-stolen-ip-addresses
>

"When African ISPs allocate IP addresses to all those new electronic
devices, they will not be using the legacy IPv4 addresses AFRINIC is
currently risking its existence over. Instead, they will be allocating

the
IPv6 addresses that represent the future of the Internet, both inside and
outside Africa."

*whew*

OK, that was a good laugh.
I needed some humour to start my day off.  ;P

Sorry--any article that ignores the non-starting aspect of IPv6-only
connectivity isn't worth the electrons it's (not) printed on.  :/

The sad fact of the Internet today is that without at least *some* IPv4
addresses, you're not on the Internet.
Sure, you can do 464XLAT and other things like that to *minimize* the
amount of IPv4 addresses you need, but you can't run a pure IPv6-only
network today for consumer use; there's too much of the Internet you just
can't access without at least some IPv4 presence.   And as such, that means
that every ISP, every company that wants to be multihomed to more than one
upstream provider requires allocations of *both* IPv4 and IPv6 addresses in
order to be functional.

I think it's an ugly situation all around, but from my reading of the
Consolidated Resource Policy Manual,
what Cloud Innovations did is clearly against the intent stated in the
AFRINIC policy manual:
"5.4.6.2 AFRINIC resources are for AFRINIC service region and any use
outside the region should be solely in support of connectivity back to the
AFRINIC region."

That clause has been in the AFRINIC Consolidated Policy Resource Manual
since version 0.1, published nearly a decade ago in 2014.
https://afrinic.net/cpm-0-1

Now, if I had been involved in crafting the policy document, I would have
strongly recommended that the particular clause be included in section 5.2,
rather than 5.4, as it really should have been broadly applicable no matter
what phase of exhaustion the IPv4 pool happened to be in at the time.  By
tucking it in under 5.4, in the "Soft Landing" portion of the document, it
wrapped the regional requirement under a relatively restrictive scope:
"This IPv4 Soft Landing policy applies to the management of address space
that will be available to AFRINIC after the current IPv4 pool is depleted.
The purpose of this document is to ensure that address space is assigned
and/or allocated in a manner that is acceptable to the AFRINIC community
especially during this time of IPv4 exhaustion."

Had policy 5.4.6.2 instead been policy 5.2.1.5, this would be a moot
discussion, and Cloud Innovations would clearly be in the wrong, and
AFRINIC would be clearly justified in clawing the number resources back.

However, because the regional use restriction was tucked under the rubric
of the "applies to the management of address space that will be available
to AFRINIC *after* the current IPv4 pool is depleted" stipulation (emphasis
mine), it leaves the argument open that until AFRINIC completely exhausted
its available IPv4 pool, no such regional restriction should apply.

I do not envy either party in this fight.

But if nothing else, it can provide guidance on why number policy matters,
and why it is useful to have contrarians that look at every clause and
wonder "could this be abused in a way we hadn't considered?"   ^_^;

Matt


Re: IP Planning and Modelling Tools

2023-08-24 Thread Matthew Petach
On Wed, Aug 23, 2023 at 12:15 AM Pascal Masha  wrote:

> Hello Folks,
>
> Any good alternatives to Ciena Blue Planet out there?
>
> Regards,
> Paschal Masha
>

Hi Pascal,

I'm curious--what is it you need to do that you can't do within Netbox?
(https://netbox.dev/)

Matt


Re: Internet Exchange Visualization

2023-08-22 Thread Matthew Petach
*facepalm*

You asked for a cost-free, publicly visible and available tool.

The lack of such does *not* mean tools don't exist.  It just means you
won't find them available for free to the general public.

Asking if X exists and being told 'no' does not say anything about whether
Y exists or not.

People who need to know have tools.

Those tools are generally not free, however.

Matt



On Tue, Aug 22, 2023, 01:40 Thomas Beer  wrote:

> Hi All!
>
> to make an (intermediate) summary so far, it's 2023 and there are no tools
> available
> for BGP, ASN and IX interconnection visualization static or dynamic?!
>
> Nobody has a top-level understanding / awareness of the infrastructure
> topology and fixes
> "bottlenecks", route misconfiguration et al. on a peer - to - peer basis?!
>
> Cheers
> Tom
>
> On Tue, 22 Aug 2023 at 02:34, Dave Taht  wrote:
>
>> I hear the cybergeography project is making a comeback.
>>
>>
>> https://personalpages.manchester.ac.uk/staff/m.dodge/cybergeography/atlas/atlas.html
>>
>> On Mon, Aug 21, 2023 at 5:17 PM Matthew Petach 
>> wrote:
>> >
>> >
>> >
>> > On Sun, Aug 20, 2023 at 11:06 PM Thomas Beer 
>> wrote:
>> >>
>> >> Hi Matt,
>> >>
>> >>>
>> >>> You might mean "exchange inter-connections" as "how are the different
>> internet exchanges connected to each other?"
>> >>> in which case the answer is generally "through the Internet".  ^_^;
>> >>
>> >>
>> >> I meant ix internet exchange path visualization and an online tool to
>> take a look at it in (near) real time!
>> >>
>> >> Cheers
>> >>
>> >
>> >
>> > Ah, thank you for the enlightening clarification.
>> >
>> > No such tool exists, sorry.
>> >
>> > Thanks!
>> >
>> > Matt
>> >
>>
>>
>>
>> --
>> Podcast: https://www.youtube.com/watch?v=bxmoBr4cBKg
>> Dave Täht CSO, LibreQos
>>
>


Re: Internet Exchange Visualization

2023-08-21 Thread Matthew Petach
On Sun, Aug 20, 2023 at 11:06 PM Thomas Beer  wrote:

> Hi Matt,
>
>
>> You might mean "exchange inter-connections" as "how are the different
>> internet exchanges connected to each other?"
>> in which case the answer is generally "through the Internet".  ^_^;
>>
>
> I meant ix internet exchange path visualization and an online tool to take
> a look at it in (near) real time!
>
> Cheers
>
>

Ah, thank you for the enlightening clarification.

No such tool exists, sorry.

Thanks!

Matt


Re: Destination Preference Attribute for BGP

2023-08-18 Thread Matthew Petach
On Fri, Aug 18, 2023 at 2:36 PM Mark Tinka  wrote:

> [...]
> To be fair, you are talking about an arbitrary value of years back, on
> boxes you don't name running code you won't mention.
>
> This really not saying much :-).
>

Hi Mark,

I know it's annoying that I won't mention specifics.
Unfortunately, the last time I mentioned $vendor-specific information on
NANOG, it was picked up by the press, and turned into a multimillion dollar
kerfuffle with me at the center of the cross-hairs:
https://www.google.com/search?q=petach+kablooie

After that, I've learned it's best to not name specific very-big-name
vendors on NANOG posts.

What I *can* say is that this was one of the primary vendors in the
Internet backbone space, running mainstream code.
The only reason it didn't affect more networks was a function of the
particular cluster of signalling communities being applied to all inbound
prefixes, and how they interacted with the vendor's hash algorithm.

> Corner cases, while valid, do not speak to the majority. If this was a
> major issue, there would have been more noise about it by now.
>

I prefer to look at it the other way; the reason you didn't hear more noise
about it, is that we stubbed our toes on it early, and had relatively fast,
direct access to the development engineers to get it fixed within two
days.  It's precisely *because* people trip over corner cases and get them
fixed that they don't end up causing more widespread pain across the rest
of the Internet.


> There has been quite some noise about lengthy AS_PATH updates that bring
> some routers down, which has usually been fixed with improved BGP code. But
> even those are not too common, if one considers a 365-day period.
>

Oh, absolutely.  Bugs in implementations that either crash the router or
reset the BGP session are much more immediately visible than "that's odd,
it's taking my routers longer to converge than it should".

How many networks actually track their convergence time in a time series
database, and look at unusual trends, and then diagnose why the convergence
time is increasing, versus how many networks just note an increasing number
of "hey, your network seems to be slowing down" and throw more hardware at
the problem, while grumbling about why their big expensive routers seem to
be less powerful than a *nix box running gated?
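
(For the curious, here's a back-of-the-napkin Python sketch of what that
tracking can look like; the 30-second "quiet gap" threshold and the idea of
feeding it UPDATE timestamps scraped from logs or BMP are assumptions for
illustration, not anyone's production tooling:

from datetime import datetime, timedelta

QUIET = timedelta(seconds=30)  # assumed churn gap that marks "converged"

def convergence_time(trigger, update_timestamps):
    # Convergence = time from the triggering event to the last UPDATE
    # in the burst, where the burst ends after QUIET seconds of silence.
    last = trigger
    for ts in sorted(update_timestamps):
        if ts < trigger:
            continue
        if ts - last > QUIET:
            break
        last = ts
    return last - trigger

t0 = datetime(2023, 8, 18, 12, 0, 0)
burst = [t0 + timedelta(seconds=s) for s in (1, 3, 9, 20, 41, 300)]
print(convergence_time(t0, burst))  # 0:00:41; the 300s update is new churn

Graph that per session per day, and the slow-creep problems I'm describing
become visible long before anything actually breaks.)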

I suspect there are more of these types of "corner cases" out there than you
recognize.
It's just that most networks don't dig into routing performance issues
unless it actually breaks the router, or kills BGP adjacencies.

If you *are* one of the few networks that tracks your router's convergence
time over time, and identifies and resolves unexpected increases in
convergence time, then yes, you absolutely have standing to tell me to pipe
down and go back into my corner again.  ;D


> Mark.
>

Thanks!

Matt


Re: Internet Exchange Visualization

2023-08-15 Thread Matthew Petach
On Tue, Aug 15, 2023 at 8:24 AM Thomas Beer  wrote:

> Hi All!
>
> Has anybody a link to a cost-free service for visualizations of internet
> exchange inter-connections?
>
> Thanks & cheers
> Tom
>

Hi Tom,

There's a couple of different ways of interpreting your question, which
will impact the answers you get.

You might mean "exchange inter-connections" as "how are the different
internet exchanges connected to each other?"
in which case the answer is generally "through the Internet".  ^_^;

You might instead be thinking of "how are different participants in a
single internet exchange cross-connected to
each other?"  -- in which case the answer is "through in-building wiring
that often even the building owner isn't
entirely aware of what path the connections are taking."   ^_^;

You might also be asking about BGP relationships, not physical connections,
in which case mining route-views,
RIPE, and other BGP data sources, along with PeeringDB will allow you to
see a percentage of the picture, though
not the entire model (a rough sketch of that mining follows below).

You might also be asking about "visualization" in the sense of "looking at
traffic volumes across interconnections",
at which point you're whistling in the dark; other than the aggregate
traffic volume visualization that some exchanges
provide, nobody is sharing their traffic graphs externally, sorry.
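
(On the BGP-relationship reading, a rough sketch of the PeeringDB half of
that mining, assuming the public netixlan endpoint and its "name" field
behave as documented; pagination and error handling omitted:

import json
import urllib.request

def ix_memberships(asn):
    # Names of the exchanges where `asn` reports a port in PeeringDB.
    url = "https://www.peeringdb.com/api/netixlan?asn=%d" % asn
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)["data"]
    return {entry["name"] for entry in data}

# Exchanges where two networks *could* interconnect directly:
print(sorted(ix_memberships(6939) & ix_memberships(2914)))

Bear in mind that shared presence at an exchange only tells you a peering
is possible, not that a BGP session exists, much less what traffic it
carries.)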

So, the first step to getting a meaningful answer is to clarify your
question a bit more.
Are you asking about interconnections *between* internet exchanges, or
interconnections *within* a single exchange?
Are you looking for physical layer interconnection information, or logical
(BGP neighbor) interconnection information?
Are you just looking for a binary "does an interconnection relationship
exist", or are you looking to visualize traffic volumes across that
relationship?

If you can provide a bit more clarity in what you're looking for, we'll
have a better idea of how exactly to tell you you're out of luck.   ^_^;

Thanks!

Matt


Re: Cogent Abuse - Bogus Propagation of ASN 36471

2023-07-20 Thread Matthew Petach
On Thu, Jul 20, 2023 at 8:09 AM Pete Rohrman 
wrote:

> Ben,
>
> Compromised as in a nefarious entity went into the router and changed
> passwords and did whatever.  Everything advertised by that comprised router
> is bogus.  The compromised router is owned by OrgID: S2NL (now defunct).
> AS 36471 belongs to KDSS-23
> .  The
> compromised router does not belong to Kratos KDSS-23
> , and is
> causing routing problems.  The compromised router needs to be shut down.
> The owner of the compromised router ceased business, and there isn't anyone
> around to address this at S2NL.  The only people that can resolve this is
> Cogent.   Cogent's defunct customer's router was compromised, and is
> spewing out bogus advertisements.
>
> Pete
>


Hi Pete,

This seems a bit confusing.

So, S2NL was a bill-paying customer of Cogent with a BGP speaking router.
They went out of business, and stopped paying their Cogent bills.
Cogent, out of the goodness of their hearts, continued to let a non-paying
customer keep their connectivity up and active, and continued to freely
import prefixes across BGP neighbors from this non-paying defunct customer.
Now, someone else has gained access to this non-paying, defunct customer's
router (which Cogent is still providing free connectivity to, out of the
goodness of their hearts), and is generating RPKI-valid announcements from
it, which have somehow not caused a flurry of messages on the outages list
about prefix hijackings.

The elements to your claim don't really seem to add up.
1) ISPs aren't famous for letting non-bill-paying customers stay connected
for very long past the grace period on their billing cycle, let alone long
after the company has gone belly-up.
2) It's not impossible to generate RPKI-valid announcements from a hijacked
network, but it's very difficult to generate *bogus* RPKI-valid
announcements from a compromised router--that's the whole point of RPKI, to
be able to validate that the prefixes being announced from an origin are
indeed the ones that are owned by that origin.

Can you provide specific prefix and AS_PATH combinations being originated
by that router that are "bogus" and don't belong to the router's ASN?

If, however, what you meant is that the router used to be ASN X, and is
now suddenly showing up as ASN 36471, and Cogent happily changed their BGP
neighbor statements to match the new ASN, even though the entity no longer
exists and hasn't been paying their bills for some time, then that would
imply a level of complicity on Cogent's part that would make them unlikely
to respond to your abuse reports.  That would be a very strong allegation
to make, and the necessary level of documented proof of that level of
malfeasance would be substantial.

In short--I'm having a hard time understanding how a non-paying entity
still has working connectivity and BGP sessions, which makes me suspect
there's a different side to this story we're not hearing yet.   ^_^;

Thanks!

Matt








Re: 1299 capacity constraints

2023-07-16 Thread Matthew Petach
On Fri, Jul 14, 2023, 15:27 Ross Tajvar  wrote:

> It extremely depends on who you're trying to reach and from what location.
> We've seen lots of T1s have congested peering lately.
>

Whoa.

I thought I was the only one old-school enough to still be using a T1 for
connectivity.  Are people seriously actually trying to use T1s for peering
in this day and age?  ^_^;

Matt


Re: My first ARIN Experience but probably not the last, unfortunately..

2023-07-16 Thread Matthew Petach
On Fri, Jul 14, 2023 at 2:09 PM Darin Steffl 
wrote:

> This screams of entitlement. If you can't afford $250 a year for ARIN, you
> probably shouldn't be starting a new business. Sorry
>

#define SOAPBOX

Darin,

Please remember ARIN covers more than just the relatively prosperous United
States.
There are places like Jamaica, which are also in the ARIN region, where the
average
annual income is $2,337.

Having to put aside 11% of your annual income for ARIN registry fees to
start a business
is a big decision.
I don't think you'd like it if we called you "entitled" for not wanting to
shell out 11% of your
annual income for ARIN fees to start a business.

While NANOG by name does narrow the focus to just "North America", we
should all remember
that even in North America, wealth is not distributed equally.  There are
communities that very
much need the economic development that new businesses can bring, where a
$250/year annual
fee represents a significant headwind.  Rather than pooh-pooh their
concerns, we should instead
strive to see the world through that entrepreneur's eyes, and address their
concerns, rather than
brush them aside.

 Thanks!

Matt

#undef SOAPBOX


Re: BGP routing ARIN space in APNIC region

2023-06-09 Thread Matthew Petach
On Fri, Jun 9, 2023 at 6:17 PM Jon Lewis  wrote:

> On Fri, 9 Jun 2023, Matthew Petach wrote:
>
> >
> > Hi Mike,
> >
> > In general, no, there's nothing that prevents you from doing that.
> ...
> > Now, from a network reachability perspective, you should also think
> about your own internal network connectivity.
> > If you're using the same ASN in California and Makati, you'll need
> redundant internal network connections between the two countries to ensure
> you don't end up with a
> > partitioned ASN.
> > Remember, California won't accept the advertisements from Makati over
> the external Internet, as AS-PATH loop detection will drop the
> announcements; likewise, Makati won't
> > hear the advertisements of the California IP space.
>
> Every platform I've used has a knob for turning off / relaxing as-path
> loop detection.  Note, for some platforms (at least Juniper), you may also
> have to have your upstream provider "advertise-peer-as", though I suspect
> it's highly unlikely you'd have BGP service from the same upstream in both
> CA and PH...so this won't likely be an issue.
>

I'd recommend this be treated as a "BGP 201" level exercise, not a "BGP
101" knob to turn.

If you're asking for advice from the NANOG mailing list about how to best
set up your first
"remote" network location, you're in BGP 101 territory, and probably
shouldn't be
disabling as-path loop detection as a general rule.  ^_^;

No knock on you, just that it's probably best not to do that until you're a
lot more
comfortable with the potential gotchas that can result from making changes
to the
default BGP protocol behaviour on your border routers.

Thanks!

Matt


Re: BGP routing ARIN space in APNIC region

2023-06-09 Thread Matthew Petach
Hi Mike,

In general, no, there's nothing that prevents you from doing that.
In days gone by, some networks used to require consistent advertisements
from a given ASN in all locations in order to peer.
In your case, that would have made it economically disadvantageous to use
the same ASN in Makai as California, as you'd end up backhauling a lot of
traffic.
These days, consistent advertisement requirements have largely gone by the
wayside.
Now, from a network reachability perspective, you should also think about
your own internal network connectivity.
If you're using the same ASN in California and Makati, you'll need
redundant internal network connections between the two countries to ensure
you don't end up with a partitioned ASN.
Remember, California won't accept the advertisements from Makati over the
external Internet, as AS-PATH loop detection will drop the announcements;
likewise, Makati won't hear the advertisements of the California IP space.
So, if your network design is a single internal backbone link from CA to
PH, with an expectation that if the link goes down,  you can just use
transit providers to reach the other location, you'll be in for an unhappy
surprise when your backbone link goes down.
For that reason, many networks find that the cost of acquiring a second,
distinct ASN for the remote location is considerably lower than the
headache of trying to ensure the single ASN is never partitioned.
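
(The rule that causes the partition is simple enough to fit in a few lines
of Python; this is the standard eBGP loop-prevention check every router
applies on receipt:

def accept_route(my_asn, as_path):
    # Drop any UPDATE whose AS_PATH already contains our own ASN.
    return my_asn not in as_path

# AS 65001 in both California and Makati, connected only via transit:
assert accept_route(65001, [2914, 65001]) is False  # CA drops Makati's routes
assert accept_route(65002, [2914, 65001]) is True   # everyone else hears them

Each half of the split ASN silently rejects the other half's announcements,
so neither half can reach the other.)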

But that's really more from a network design perspective; from a policy
perspective, there's largely nothing preventing you from doing that.

Best of luck!

Matt


On Fri, Jun 9, 2023 at 12:28 PM Mike  wrote:

> Hello,
>
>  I'm certain this must have been covered before but I can't find a
> lot of good-seeming answers. Essentially, I am a California based ISP
> and have plans to open up shop in Makati Philippines. I have an ASN and
> several /22's of ipv4 and a few /44s of ipv6 out of my assigned ranges
> that I intend (desire) to bring with me. I am just wondering if there is
> any network policy, filtering, or other reason why I simply couldn't
> just pop up there advertising my space and away I go? I do have ROA
> setup with arin already which should otherwise verify/validate me (great
> tool by the way, thank you).
>
>
> Thank you.
>
>
>


Re: New addresses for b.root-servers.net

2023-06-07 Thread Matthew Petach
Hi Robert,

If the goal is increased robustness by having addresses present from a
different RIR,
wouldn't it make this whole tempest in a teapot moot if, instead of
*renumbering*, you
simply *added* a second set of IPs, but continued to answer queries on the
original
addresses as well?

Is there any reason at all to unconfigure the original IPs from the servers
after the LACNIC
IP addresses are added to the servers?  I mean, it's perfectly normal for
servers to have
multiple IP addresses on them, we've been doing it for decades, and IPv6
has really hammered
home that it's normal and expected for hosts to have multiple IP addresses
on them, often from
different providers.

Thanks!

Matt


On Sun, Jun 4, 2023 at 8:06 AM Robert Story  wrote:

> On Sat 2023-06-03 23:00:33+0200 Terrence wrote:
> > Forgive me if I'm missing something obvious, but why are you
> > renumbering at all?
> >
> > Of course the diversification of RIRs is a good thing, but couldn't
> > that be accomplished just as well by transferring the current
> > allocation to LACNIC?
>
> Hi Terrence,
>
> DNS Root Server addresses from ARIN are assigned from the critical
> infrastructure pool, and ARIN policy does not allow them to be
> transferred to another RIR. The relevant policy section is:
>
> 8.4. Inter-RIR Transfers to Specified Recipients
>
> [...]
>
> Conditions on source of the transfer:
>
> [...]
> Address resources from a reserved pool (including those designated
> in Section 4.4 and 4.10) are not eligible for transfer.
>
> Regards,
> --
> Robert Story
> USC Information Sciences Institute 
> Networking and Cybersecurity Division
>


Re: New addresses for b.root-servers.net

2023-06-02 Thread Matthew Petach
On Fri, Jun 2, 2023 at 10:40 AM William Herrin  wrote:

> On Fri, Jun 2, 2023 at 9:57 AM Jim  wrote:
> > A major concern would be if the IP address were eventually re-assigned
> to something else that
> > ended up reporting false answers due to a malicious or misconfigured DNS
> service.
>
> Hi Jim,
>
> That's one reason I suggested intentionally making it a false
> responder for the final year of its post-service hold. Return wildcard
> A and AAAA records for all queries pointing to a web site which
> responds to any URL with, "Hey buddy, your DNS software is so grossly
> out of date that now it's broken and will stay broken until you fix
> it."
>
> Anybody still sending queries after that gets what they get and
> deserves it -- as long as the time that passes until the final year is
> long enough that only the most reckless and incompetent users are
> still sending queries.
>

I think you underestimate the time frames involved in some projects.
My older brother was deeply involved in the James Webb space telescope
project.
At one point, while visiting him at the giant clean room in Redondo Beach,
we started talking about the specifications on the computers onboard the
telescope.  I was aghast at how out-of-date the systems being installed
were,
and noted I could pop over to Fry's and pick up something with 20x the
memory,
running 10x as fast with pocket money.
He countered by pointing out there were thousands of subcontractors
involved
in the project, and everything had to come together smoothly at the end.
Once
the design work was completed, *everything* was frozen; no changes were
allowed,
no matter how well-intentioned, because there could be unanticipated ripple
effects
on other components being worked on by completely independent
subcontractors.
The end result being that what was being launched was based on hardware and
software that was finalized nearly two decades earlier.

It's a bit unkind to think that only "reckless and incompetent users" will
still be
sending queries years later, when there are plenty of projects like the
James
Webb space telescope where the elements were locked in years before any
decision to renumber root servers might have been made.

I agree with Jim.  Once a block was in use by a root server instance,
encoded
in root hints files, it should be forever reserved as such.  If we want to
make
use of different RIRs and distribute responsibility around the planet,
transfer
the ownership of a block from one RIR to another; don't count on everything
on and off the planet being able to update their root hints.

Thanks!

Matt


Re: Do ISP's collect and analyze traffic of users?

2023-05-16 Thread Matthew Petach
On Tue, May 16, 2023 at 1:10 AM Jeroen Massar  wrote:

>
>
> > On 16 May 2023, at 06:46, Matthew Petach  wrote:
> > [..]
> > I admit, I'm perhaps a little behind on the latest netflow whiz-bangs,
> > but I've never seen a netflow record type that included HTTP cookies
> > or PCAP data before.
>
> Take your pick from the "latest" ~2009 IPFIX Information Elements:
>
> https://www.iana.org/assignments/ipfix/ipfix.xhtml
>
> One can stuff almost anything in there.
>
> Now if one should, and if one is allowed to.
>

Wow.

Thank you, Jeroen, I was indeed a bit out of date.
Thank you for the pointer!

(For those in the same boat as I, here's the relevant portion that clearly
points out that yes, you can export the entire packet if you so desire):

313 ipHeaderPacketSection octetArray default current

This Information Element carries a series of n octets from the IP header of
a sampled packet, starting sectionOffset octets into the IP header.

However, if no sectionOffset field corresponding to this Information
Element is present, then a sectionOffset of zero applies, and the octets
MUST be from the start of the IP header.

With sufficient length, this element also reports octets from the IP
payload. However, full packet capture of arbitrary packet streams is
explicitly out of scope per the Security Considerations sections of
[RFC 5477] and [RFC 2804].
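
(To make the framing concrete, here's a minimal sketch of an RFC 7011
message carrying IE 313, built by hand with struct; this is illustration,
not any particular exporter's code:

import struct
import time

IPFIX_VERSION = 10
TEMPLATE_SET_ID = 2
TEMPLATE_ID = 256           # data set IDs must be >= 256
IE_IP_HEADER_SECTION = 313  # ipHeaderPacketSection (octetArray)
VARLEN = 0xFFFF             # RFC 7011 marker for a variable-length field

def build_ipfix_message(packet_bytes, obs_domain=1, seq=0):
    # Template set: one record with a single variable-length IE 313.
    template = struct.pack("!HHHH", TEMPLATE_ID, 1,
                           IE_IP_HEADER_SECTION, VARLEN)
    template_set = struct.pack("!HH", TEMPLATE_SET_ID,
                               4 + len(template)) + template
    # Data set: a variable-length value under 255 octets is prefixed
    # with a single length octet.
    assert len(packet_bytes) < 255
    record = struct.pack("!B", len(packet_bytes)) + packet_bytes
    data_set = struct.pack("!HH", TEMPLATE_ID, 4 + len(record)) + record
    body = template_set + data_set
    header = struct.pack("!HHIII", IPFIX_VERSION, 16 + len(body),
                         int(time.time()), seq, obs_domain)
    return header + body

# The first 64 octets of a sampled packet, straight into a flow record:
msg = build_ipfix_message(b"\x45" + b"\x00" * 63)

Nothing in the wire format stops packet_bytes from being the whole packet;
only the security-considerations text in the RFCs does.)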



 Thanks!

Matt
(still learning after all these years.   ^_^ )


Re: Do ISP's collect and analyze traffic of users?

2023-05-15 Thread Matthew Petach
On Mon, May 15, 2023 at 6:42 PM Dave Phelps  wrote:

> I think it's safe to assume they are selling such data.
>
>
> https://www.techdirt.com/2021/08/25/isps-give-netflow-data-to-third-parties-who-sell-it-without-user-awareness-consent/
>
>
> https://www.vice.com/en/article/dy3z9a/fbi-bought-netflow-data-team-cymru-contract
>

From the second article:

"Team Cymru’s products can also include data such as URLs visited, cookies,
and PCAP data"

Really?  From Netflow?

I admit, I'm perhaps a little behind on the latest netflow whiz-bangs,
but I've never seen a netflow record type that included HTTP cookies
or PCAP data before.

Certainly, the products listed on the Team Cymru website don't make any
mention
of including cookies or PCAP data, at least not from what I've been able to
ascertain from digging through their product listing.

Is there some secret "off the menu" product that allows one to purchase a
data feed that includes cookies and PCAP data?

Matt


Caveat emptor: avoid Inseego 5G products unless you still believe in classful routing

2023-03-28 Thread Matthew Petach
In the category of "I can't believe I still have to worry about this in
2023"
comes an unfortunate discovery I made recently when setting up a network
for a local non-profit.  The Inseego FX2000 5G router looked like a nice
product, it supports OpenVPN out of the box, flexible firewall rules, etc.

What I did *NOT* expect from a device made in 2023, and didn't think to
ask about ahead of time, is whether it supported classless routing.

Setting the unit up, I discovered the hard way that the developers are
apparently still working from 1989 textbooks.  The only netmask the
router will accept for a 10.x.x.x subnet is 255.0.0.0.  It absolutely
refuses to accept a netmask of any other length.

Even the user manual reflects the inherent classful assumption:

"
IPv4
IP Address: The IP address for your FX2000, as seen from the local network.
Normally, you can use the default value.
Subnet Mask: The subnet mask network setting for the FX2000. The default
value 255.255.255.0 is standard for small (class "C") networks. If you
change the LAN IP Address, make sure to use the correct Subnet mask for the
IP address range of the LAN IP address
"

So, before anyone else makes the same mistake I did, I thought I'd give the
community a heads-up to avoid the Inseego line of 5G products, as they're
woefully behind the times in their understanding of IPv4 subnetting as it
exists in 2023.  ^_^;

Thanks!

Matt


Re: Verizon/Qwest single end-user difficulty vs Xfinity

2023-03-19 Thread Matthew Petach
On Sat, Mar 18, 2023 at 12:52 PM Jeff Woolsey  wrote:

> Verizon 5G Internet Support is not at a high-enough pay grade to assess
> this problem...  So I'm turning to y'all.
>
> I'm trying to save $$$ and increase speed, using Verizon 5G Home
> Internet to replace XFinity, even though they gave me a faster modem a
> few weeks ago.  I run both of the modems in Bridge/Passthrough mode.
>

Uh...there's a pretty big difference between "Bridge" and "IP Passthrough";

I suspect you're actually running IP Passthrough, *not* bridge, and therein
may lie your problem.

In Bridge mode, the CPE acts as a layer 2 device, and by and large does not
get involved in layer 3 politics.

In IP Passthrough mode, the CPE is the layer 3 termination point for the IP
address; it looks at the five tuple to determine if the packet is one that
*it*
needs to accept (management traffic from the ISP to the CPE), in which case
it is handed to the CPE CPU to process locally; otherwise, the destination
MAC
is altered to the customer's router MAC address, and the frame is re-sent
out
the LAN side towards the customer's router.

Because the CPE is the initial termination point for the layer 3 connections
in IP Passthrough mode, you have two points of possible interaction:
1) you should make sure any and all firewall settings, content filters, and
ALGs are disabled on the CPE, as they will still block traffic from being
passed through
and
2) any port/protocol tuple on the CPE that is used for managing the device
from the ISP end *cannot* be passed through to the customer router, as it
will be intercepted and terminated on the CPE CPU locally.

So--if you've turned off every family filter option, every firewall rule,
and ALG, and you still can't reach that port,
I suspect you're trying to use a port that is one that the ISP uses for
managing their CPE devices, such as TCP 7547.
Try switching to a different port number, and see if your connection works
as expected.
For exhaustively in-depth details of which 5-tuples your CPE will ingest
upstream of you while in IP Passthrough mode, I refer you to
https://www.broadband-forum.org/download/TR-069_Amendment-5.pdf
specifically sections 3.2.2 and Annex K, starting on page 185.
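
(As a toy model of the triage described above; the management-port set is
illustrative, with TCP 7547 being the usual TR-069 CWMP port:

LOCAL_SERVICES = {("tcp", 7547)}  # assumed ports the ISP manages the CPE on

def classify(proto, dst_ip, dst_port, cpe_wan_ip, customer_mac):
    # IP Passthrough: the CPE still owns the WAN IP at layer 3, so every
    # inbound packet gets triaged before the customer router sees it.
    if dst_ip != cpe_wan_ip:
        return "drop"                       # not ours at layer 3
    if (proto, dst_port) in LOCAL_SERVICES:
        return "punt to CPE CPU"            # ISP management traffic
    return "rewrite dst MAC to %s, re-send out LAN" % customer_mac

That middle branch is exactly why moving your service to a different port
number is a useful diagnostic.)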

Best of luck!

Matt


Re: Starlink routing

2023-01-22 Thread Matthew Petach
On Sun, Jan 22, 2023 at 2:45 PM Michael Thomas  wrote:

> I read in the Economist that the next gen of starlink satellites will have
> the ability to route messages between each satellite. Would conventional
> routing protocols be up to such a challenge? Or would it have to be
> custom made for that problem? And since a lot of companies and countries
> are getting on that action, it seems like fertile ground for (bad) wheel
> reinvention?
>
> Mike
>
>

Unlike most terrestrial links, the distances between satellites are not
fixed,
and thus the latency between nodes is variable, making the concept of
"Shortest Path First" calculation a much more dynamic and challenging
one to keep current, as the latency along a path may be constantly changing
as the satellite nodes move relative to each other, without any link state
actually
changing to trigger a new SPF calculation.

I suspect a form of OLSR might be more advantageous in a dynamic partial
mesh between satellites, but I haven't given it as much deep thought as
would
be necessary to form an informed opinion.

So, yes--it's likely the routing protocol used will not be entirely
"off-the-shelf"
but will instead incorporate continuous latency information in the LSDB,
and path selection will be time-bound based on the rate of increase in
latency
along currently-selected edges in the graph.
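
(A minimal sketch of that idea, assuming nothing about what SpaceX actually
runs: plain Dijkstra over measured latencies, re-triggered by latency drift
rather than by link-state changes:

import heapq

def spf(graph, src):
    # Plain Dijkstra over {node: {neighbor: latency_ms}}.
    dist, heap = {src: 0.0}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def needs_respf(old, new, drift=0.10):
    # Inter-satellite links rarely go "down"; they stretch.  Recompute
    # only when some edge latency has drifted more than, say, 10%.
    return any(abs(new[u][v] - w) / w > drift
               for u in old for v, w in old[u].items())

The hard part is picking the drift threshold so the constellation isn't
recomputing constantly, while paths still get abandoned before they
stretch too far.)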

An interesting problem to dive into, certainly.   :)

Thanks!

Matt


Re: AS3356 Announcing 2000::/12

2022-12-09 Thread Matthew Petach
On Thu, Dec 8, 2022 at 9:35 AM Randy Bush  wrote:

> while i think the announcement is, shall we say, embarrassing, i do not
> see how it would be damaging.  real/correct announcements would be for
> longer prefixes, yes?
>
> randy
>


 Putting on a probably-overly-paranoid hat for a moment...

If I announce 2000::/12, seemingly as an innocent error,
it won't break most people's routing, and is likely to be simply
chalked up as a copy-paste error, or other human "oops".

But if I happen to be running a promiscuous packet capture
on a box that the "erroneous" routing table entry ultimately
resolves to, I warrant there's a certain amount of legitimate
packet streams I could collect here and there, any time a
router processes a WITHDRAW update message for a more
specific prefix within the range, before a new ANNOUNCE
update message is processed.

I'm not going to get a great deal of information, as most
simple prefix updates happen within the same update
message; but during periods of higher internal churn in a
network, you may have brief periods during which the more
specific route is withdrawn before being re-announced, during
which I'd be able to harvest packets destined for other networks.
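
(The mechanism, in a few lines of Python: longest-prefix match quietly
falls back to the covering /12 for exactly as long as the more-specific is
absent from the table:

import ipaddress

def lpm(fib, dst):
    # Longest-prefix match over a list of (prefix, next_hop) entries.
    best = max((p for p, _ in fib if dst in p),
               key=lambda p: p.prefixlen, default=None)
    return dict(fib)[best] if best is not None else None

covering = (ipaddress.ip_network("2000::/12"), "collector")
specific = (ipaddress.ip_network("2001:db8::/32"), "legitimate origin")
dst = ipaddress.ip_address("2001:db8::1")

assert lpm([covering, specific], dst) == "legitimate origin"
assert lpm([covering], dst) == "collector"  # the withdraw/announce gap

No state needs to be attacked or poisoned; the fallback is just how
forwarding works.)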

As I said--I'm probably being overly paranoid, but I can't help but
wonder what packets such a collector might see, if left to run for a
week or two... ^_^;

Thanks!

Matt


Re: ingress/egress 9/8 gov.xxx.ticket; was: Re: pls pls me 80/81...

2022-11-24 Thread Matthew Petach
Whoa!

Is it the start of April already?
I must have overslept last night, I could have sworn we just barely made it
to Thanksgiving!

Matt

On Tue, Nov 22, 2022 at 6:48 AM AQ Glass  wrote:

> anybody interested in this project?
>
> @oracle can own the .ticket tld; NS * ticket. -> virtualhost
>
> 96hr netflow + tech/biz/admin for adnetworks to reg their new xxx. with
> gov.xxx.ticket signup and further instructions
>
> //
>
> https://twitter.com/element9v/status/1579552246911885312
>
> codedevs can commit to
> https://github.com/element9v/takebackdarpa
>
> thx
> -e
>
> On Thu, Nov 17, 2022, 3:50 PM AQ Glass  wrote:
>
>>
>> https://twitter.com/element9v/status/1592162934658334720
>>
>> #routetonull discuss
>>
>> #takebackdarpa
>>
>> -e
>>
>


Re: Alternative Re: ipv4/25s and above

2022-11-23 Thread Matthew Petach
On Tue, Nov 22, 2022 at 8:26 PM Abraham Y. Chen  wrote:

> Dear Tom:
>
[...]

>
> 2)   "...Your proposal appears to rely on a specific value in the IP
> option header to create your overlay": Not really, as soon as the
> 100.64/10 netblock is replaced by the 240/4, each CG-NAT module can
> serve a very large area (such as Tokyo Metro and such) that becomes the
> RAN in EzIP terminology. Since each RAN is tethered from the existing
> Internet core by an umbilical cord operating on one IPv4 public address,
> this is like a kite floating in the sky which is the basic building
> block for the overlaying EzIP Sub-Internet when they expand wide enough
> to begin covering significant areas of the world. Note that throughout
> this entire process, the Option Word mechanism in the IP header does not
> need be used at all. (It turns out that utilizing the CG-NAT
> configuration as the EzIP deployment vehicle, the only time that the
> Option Word may be used is when subscribers in two separate RANs wishing
> to have end-to-end communication, such as direct private eMail exchanges.)
>


Hi Abraham,

I notice you never replied to my earlier questions about EzIP deployment.
I'll assume for the moment that means my concerns were without merit, and
will leave them aside.

But in reading this new message, I find myself again rather confused.

You stated:
"Since each RAN is tethered from the existing Internet core by an umbilical
cord operating on one IPv4 public address,"

I find myself staring at that statement, and puzzling over and over again
at how multi-homing would work in the EzIP world.

Would a given ISP anycast their single global public IPv4 address
to all their upstream providers from all of their edge routers,
and simply trust stable routing in the DFZ to ensure packets arrived
at the correct ingress location to be mapped from the public internet
into the RAN?

Or do you really mean that every RAN will have one giant single point
of failure, a single uplink through which all traffic must pass in order to
reach the DFZ public internet?

If your regional network is a housing subdivision, I can understand the
model of a single uplink connection for it; but for anything much larger,
a single uplink seems like an unsustainable model.  You mention Tokyo Metro
in your message as an example.  What size single uplink do you think would
be sufficient to support all the users in the Tokyo Metro region?  And how
unhappy would they be if the single router their 1 public IP address lived
on happened to have a hardware failure?

Wouldn't it be better if the proposed model built in support for
multihoming from day one, to provide a similar level of redundancy
to what is currently available on the Internet today?

Or is EzIP designed solely for small, singled-homed residential
customers, and is not intended at all for enterprise customers
who desire a more resilient level of connectivity?

As I noted in my previous message, this seems like an awful lot of
work to go through for relatively little benefit--but this may simply be
due to a lack of essential clue on my part.  I would very much like to
be enlightened.

Thank you!

Matt


Re: Alternative Re: ipv4/25s and above

2022-11-20 Thread Matthew Petach
On Fri, Nov 18, 2022 at 7:53 PM Abraham Y. Chen  wrote:

> Dear Owen:
>
> 1) "... Africa ... They don’t really have a lot of alternatives. ...":
> Actually, there is, simple and in plain sight. Please have a look at the
> below IETF Draft:
>
>
> https://datatracker.ietf.org/doc/html/draft-chen-ati-adaptive-ipv4-address-space


Hi Abraham,

I know I'm not the sharpest tool in the shed, but I'm having some
trouble understanding the deployment model for EzIP.  Perhaps you
could help clear it up for me?

A non-EzIP web server is only going to see the global destination
IP address and TCP port number as the unique session identifiers
for communication, so the vast amount of additional IP space you
postulate existing behind the SPR functionally collapses down into
the 64K TCP port range available today in traditional port-based NAT
setups.  As long as the top 50 websites aren't EzIP-aware, there appears
to be no benefit for an ISP to deploy EzIP, because it doesn't actually
gain them anything beyond what existing CG-NAT devices already provide
as far as their web-browsing customer base is concerned.  Most of their
communication will still be port-based-NAT, with all the headaches and
restrictions inherent in that.
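
(The arithmetic, spelled out; the 240/4 size is from the draft, and the
one-public-address umbilical is the RAN model as I understand it:

hosts_in_240_slash_4 = 2 ** 28        # ~268M EzIP-side addresses per realm
public_ipv4_per_umbilical = 1         # one address fronting the whole RAN
usable_tcp_ports = 2 ** 16 - 1        # all a remote 5-tuple can distinguish
print(public_ipv4_per_umbilical * usable_tcp_ports)  # 65535

Roughly 268 million addresses on the inside still collapse to 65,535
distinguishable concurrent sessions on the outside.)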

And for the top 50 websites, there's a lot of expense and absolutely no
upside
involved in deploying EzIP.  They don't care about how much IP space you
have
behind your NAT device, and whether it's uniquely identifiable within your
local
realm; it's not information they can access for tracking users, as they
don't know
what your internal mapping is, so they'll continue to rely on browser
cookies and
the like for tracking actual user sessions, regardless of the IP addressing
format
being used.

So, you've got a model where there's functionally almost no benefit to the
end user
until the major websites start deploying EzIP-aware infrastructure, and
there's
pretty much no incentive for major websites to spend money upgrading their
infrastructure to support EzIP, because it provides no additional benefit
for them.

This is pretty much exactly the same conundrum that IPv6 faced (and still
faces
today).  I don't see why EzIP is any better than IPv6 or CG-NAT in terms of
solving
the IPv4 address shortage problem, and it seems to involve more expense for
web
providers, because it requires them to deploy additional SPR mapping
routers into
their infrastructure in order to use it, with essentially no additional
benefit.

Is there a piece to the picture I'm not understanding correctly?

Thanks!

Matt


Re: BCP38 For BGP Customers

2022-11-08 Thread Matthew Petach
On Tue, Nov 8, 2022 at 8:44 AM Grant Taylor via NANOG 
wrote:

> [...]
>
> I don't understand why you would want to allow packets that couldn't
> return the same path.
>
> As for asymmetrically routed packets, I would still expect a return path
> to exist, even if it's not utilized.
>
>
Grant,

You're thinking about it from the upstream perspective, where a route
could be accepted but depreferenced and thus not actively used.
Think about it from the downstream network's perspective, though.
If you're my upstream, and I don't want to use your link for inbound
traffic, but I'd like to be able to send out some traffic over the link,
how can I advertise the prefix to you in a way that would both ensure
that you have it in your table locally, so that uRPF is happy, but also
to ensure no packets actually make *use* of that routing table entry?
Sure, you could tag the routes with 'no-export', but that only prevents
the prefix from propagating outward, it doesn't prevent traffic on that
router from using the routing table entry.  You can try adjusting your
MED, and hope the upstream doesn't squash the MED back to a default
value it applies to all its customers.  For the most part, you're up
against
a wall.  You don't know how your upstream's route selection process is
stacked with respect to routes you advertise, so you have no certainty that
if you announce a prefix to them, it won't potentially be used to carry all
your inbound traffic.
The only way to be sure that you won't take inbound traffic on a link is to
not advertise the prefix at all across that link.

Why might this be necessary, you ask?

Imagine you've got links of different sizes on your network.
You have a 1G link to provider A
You have a 1G link to provider B
You have a 10G link to cheap provider C
You have a 10G link into a peering exchange

Somewhere beyond provider A, someone decides they don't like one of your
customers, and sends a 5Gbps DDoS flow at you.

If you continue advertising that prefix to provider A, everyone going
through
provider A will suffer, including all their customers.  You have plenty of
capacity
to take the inbound flow through the exchange point and through cheap
provider C.

So, you stop advertising the prefix under attack to provider A and provider
B, to ensure
the traffic doesn't saturate your 1G links.

Inbound traffic happily shifts to the exchange point port and provider C's
port.
Life is good.

Oh no!  Provider A has implemented uRPF, and now all their customers are
unhappy
because they can't reach your websites on that prefix, because the *return*
traffic
is still flowing out the 1G link directly to provider A.
Trying to implement a source-based routing filter to redirect all traffic
*coming* from
the prefix under attack destined to ISP A to instead go through ISP C is a
pain in the
butt.  So, you grudgingly stop accepting routes from ISP A, as that's the
only way to
make the uRPF pain stop.

Now, none of your traffic is flowing out the 1G link to ISP A;  their
customers are happy
again, because they can reach websites on your prefix that is under attack
(via ISP C,
or the exchange point).

At the end of the month, you look at your contracts, and realize that you
had to spend
a chunk of your limited engineering resources working around the upstream
uRPF filter,
and ultimately ended up not sending traffic across a link you were paying
for.

When renewal time comes, you decide it's not worth the headache to pay for
a link to
ISP A that you can't reliably use, and that requires manual intervention to
work around
whenever creative routing solutions are necessary.  You don't renew your
contract with
ISP A.
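
(For anyone following along at home, the two uRPF flavors in play, as a
conceptual Python sketch rather than any vendor's implementation:

def urpf_permits(mode, ingress_iface, rpf_ifaces):
    # rpf_ifaces: interfaces the FIB would use to reach the packet's
    # *source* address.  Strict mode demands the best path back to the
    # source point out the ingress interface; loose mode just demands
    # that some route to the source exists.
    if mode == "loose":
        return bool(rpf_ifaces)
    return ingress_iface in rpf_ifaces

# After the customer withdraws the prefix from ISP A, A's best path back
# to it is via the exchange or ISP C, so strict mode drops the customer's
# outbound packets arriving on the direct link:
assert urpf_permits("strict", "cust-link", {"peering", "transit-C"}) is False
assert urpf_permits("loose",  "cust-link", {"peering", "transit-C"}) is True

Loose mode would have kept this customer.  Strict mode lost them.)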

As Mr. Bush might say,
"I recommend all my competitors implement this in their networks."

Thanks!

Matt


Re: Newbies Question: Do I really need to sacrifice Prefix-aggregation to do BGP Load-sharing?

2022-10-20 Thread Matthew Petach
On Thu, Oct 20, 2022 at 6:23 AM Jon Lewis  wrote:

> [...]
> While writing this though, two things occurred to me.
>
> 1) Are there any networks with routing policy that looks at prepends and
> says "if we see a peering path with >X number of prepends (or maybe
> just path length >X), demote the localpref to transit or lower"?  "i.e.
> They obviously don't want us using this path, turn it into a backup
> path."
>
> 2) Particularly back when it was found some BGP implementations broke when
> encountering unusually long as-paths, I think it became somewhat common
> to reject routes with "crazy long" as-paths.  If such policy is still
> in place in many networks, excessive prepending would actually have the
> desired effect for those networks.  i.e. The excessive prepends would
> get that path rejected, keeping it from being used.
>

At a previous job, I explicitly crafted policies that were structured such
that:

if PREFIXLENGTH > MAXPREFIXLENGTH then reject
if ASPATH > MAXASPATH then reject
strip_internal_communities
if ASPATH > MAX_VALID_PATH then
    set localpref = TRANSIT_DEPREF_LOCALPREF
    set communities DEPREF_TRANSIT
    blah blah blah
if match external_signal_communities then
    set localpref
    set internal propagation communities
    set external propagation communities
    blah blah blah
accept

that way, if the announced block is too small (prefix too long), or the
AS path is too long (>100),
it gets dropped before even bothering to evaluate communities; save
every bit of CPU and memory you can.
Then, strip your internal communities off everything else that's a
reasonable
path length;
set a lower threshold for what you consider a "reasonable" internet
diameter
to be, including a reasonable 3x prepend at one or two levels; if it's
longer than
that, it's a backup path at best, treat it that way (below standard transit
level).
Finally, on all the remaining routes, evaluate your external signalling
communities,
and apply internal signalling communities as appropriate, and process
normally.

There's a clear tradeoff between trying to ensure maximum reachability
to the rest of the internet versus protecting your CPU and memory from
unnecessary work and state-keeping.  As mentioned in another thread,
what each network decides the MAXPREFIXLENGTH is will depend on
their relationships and the capabilities of their hardware.  It doesn't
necessarily
have to be /24 and /48, but it should be set at the longest value your
network
can happily support, unless you want to chase down odd connectivity issues
in other people's networks.  ^_^;

Thanks!

Matt


2022.10.19 NANOG86 community meeting notes

2022-10-19 Thread Matthew Petach
For anyone who missed the community meeting this morning at NANOG 86
and wants to know what was covered before the official notes and video are
available,
here's the notes I took from the meeting this morning.

And if you missed Geoff Bennett's talk on how optical networking
transformed
our world, you definitely need to see it when it gets posted.  Absolutely
worth
the price of admission!

Thanks!

Matt

2022.10.19 NANOG86 community meeting

MODERATORS: Cat Gurinsky, Ed McNair

NOTES:
Cat reminds people to identify themselves when coming to the 
microphone, and the slide deck is posted in the agenda.

Over to Ed to present the slides.

Ed welcomes everyone back to the first meeting that 
really feels like a traditional meeting again.
It was a slow march to come back, and thank you 
all for coming back together again with us!

Development updates:
Greg Newman joins as full-time employee; he has been
helping us working for a third party on our development
for the past three years, and now he's coming on board.
He does an incredible job communicating with the team,
and has been instrumental in reimagining our program 
tools.

2022 development projects
migrate membership and donations to stripe
improve test coverage
new forms for talk submissions
cluster rebuild
upgrade stripe to payment intents
appointment tool beta
dozens of bug fixes and tweaks
PCTool v2

Ed demos the new appointment tool;
first, we reimagined our registration
system; then we came up with a new 
tool for connecting with other attendees;
when you register, you can make yourself
available for appointments.  When you click 
on the little symbol, you can schedule a 
meeting with people; it sends you an email
with the invite, you can adjust the times
for meetings, and chat back and forth with
attendees to lock it in;
You can view the calendar view as well.

2023 development projects
update site navigation and UI, and staff
administration tools
doing some UI refreshes
update the registration system
sponsor tool update, saving staff time
badge printing?
appointment tool v1 will be released
updates to the virtual platform, ability to 
 communicate 1 on 1
add meeting room/table assignments into meeting tool
interactive reporting tools, add feedback
enable NOGs to use our event tools.
 looking for people to help support us!

Outreach things we've done in past few months;
we have an outreach group to expand nanog beyond
just the meetings.
NANOG U: Montgomery, collaboration between NANOG,
ICANN, ARIN, and our outreach partners; hosted by
the city of Montgomery's mayor's office;

Over to Cat for the rest.
NANOG program
Evolution of the PC;
added subcommittees:
 Data analysis sub-committee, started by Liz
 moderators sub-committee
  some people have never done public speaking; helps 
  people through, and write the moderator speeches for 
  people so it's no longer ad-hoc
broken off sub-committees (no longer under the PC)
 education
 outreach
improvements to talk submission process
 assigning of content reviewers earlier in the process,
 makes sure it's readable, presentable; we help you 
 make a good-looking presentation even if it's not
 accepted.

Evolution of the PC tool
 much easier to work with tool, filter by category, 
 you can see what stage each talk is in, make sure 
 slides are attached, 

Agenda builder--just written by Greg; used to be in 
an excel spreadsheet with title and length, with 
merged cells.
Now, using same base as appointment builder to 
build the agenda; talks are already right size 
to put into the calendar view.

Rolling call for presentations
submit now for next several nanogs, up to N89
87, atlanta, GA
88, seattle WA
89, san diego, CA

We can even accept talks for future NANOGS, 
even if it's 2 meetings away; if it's a good
talk, it's good even if it's out ahead.
Your final slides are due before the conference,

N86 and N87 talk submissions
N86, 73 submitted talks total
24 accepted talks, +2 lightning talks

N87, 12 pending talks, 1 accepted talk

tutorials and tracks room back for N87 and onwards

some N87 talks were deferred from N86 that 
weren't going to be ready in time, or people 
weren't going to be available.

we need to accept more talks to fill the 
tutorials and tracks room, so start submitting,
we're looking for 30+ talks.

If you wait until last minute, there's less likely
to be space for you.

Program Committee Volunteers
starting in January, members can submit name for 
consideration, 2 year terms, 2 consecutive terms
14 positions available, 5 to replace term-limited
PC members
appointments will be made in February.

Thank you!

QA
Luis Lee, Google Fiber; thank you for the talk;
got me thinking about tutorials; do you maybe
want to do a series on soft skills:
how to give a technical presentation
to a room this size, or to senior management,
about what you want to do (deploy IPv6, for example),
as part of our career building.  Soft skills
are important to put a team together to build
things, to get things done.

Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-16 Thread Matthew Petach
On Tue, Oct 11, 2022 at 7:03 PM William Herrin  wrote:

> On Tue, Oct 11, 2022 at 5:32 PM Matthew Petach 
> wrote:
> [...]
> All TCP/IP routing is more-specific route first. That is the expected
> behavior. I honestly don't fathom your view that BGP is or should be
> different from that norm. If the origin of a covering route has no
> problem sinking the traffic when the more-specific is offline, I don't
> see the problem. You shouldn't be taking them offline with route
> filtering.
>

*facepalm*

Right.  That's the entire point I started off the subthread with.

The problem lay with an organization that *did* have a problem
sinking the traffic when the more-specific was not available.
They had chunked up their allocation into smaller pieces
which were distributed to different island locations with no
internal network connectivity to the island sites.

They were announcing a covering prefix for all the more
specifics, where the covering less specific announcement
had no reachability to the more specifics; so when a network
filtered out the more specifics, the traffic fell on the floor, because
it was sent to a location that was announcing the supernet that
had no reachability to the correct destination.

Their assumption that *everyone* would hear the more specifics,
and thus the traffic would flow to the right island location was the
"failure to understand BGP" that I was commenting on, and noting
that while it is entirely correct to decide if you want to filter prefixes
of an arbitrary length from entering your network, you may discover
in the process that other networks that do not understand BGP and
routing in general may complain that you have Broken The Internet(tm)
by doing so.

Assuming that your announcement of more specifics will always pull
traffic away from a less-specific announcement is overly-optimistic.
While it may *often* work, you should still be prepared to deal with
traffic arriving at your least-specific announcement as well.

This turned out to be something that not every network on the
Internet fully grasps, and my original message was warning that
filtering on /24s would potentially bring complaints from networks
like those.

It took a roundabout path, but I'm glad we eventually both ended
up at the same place.   :)

Thanks!

Matt


Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-11 Thread Matthew Petach
On Tue, Oct 11, 2022 at 1:59 PM William Herrin  wrote:

> On Tue, Oct 11, 2022 at 1:15 PM Matthew Petach 
> wrote:
> > Wouldn't that same argument mean that every ISP that isn't honoring
> > my /26 announcement, but is instead following the covering /24, or /20,
> > or whatever sized prefix is equally in the wrong?
> >
> > What makes /24 boundaries magically "OK" to filter on,
>
> Hi Matthew,
>
> /24 is the consensus filtering level for Internet-wide routes and it
> has been for decades. It became the consensus as a holdover from
> "class C" and remains the consensus because too many people would have
> to cooperate to change it. Indeed, a little over a decade ago some
> folks tried to change it to /19 and then /20 for prefixes outside "the
> swamp" and, well, they failed. Likewise, more than a few folks
> announce /26's to their immediate transit providers and they simply
> don't move very deep into the system -- nobody has any expectation
> that they will.
>

Yes, I know.  I was there when smd was pointing out the arbitrary lines
being drawn in the sand, and decided to draw his own line.  The first salvo
was fired in 1996, with a customer complaining their /24 wasn't being
accepted by everyone, leading to a *very* long chorus of people chiming
in with different thoughts on where the line could and should be drawn:
https://archive.nanog.org/mailinglist/mailarchives/old_archive/1996-01/msg00306.html

My point is that it's not a feature of BGP, it's a purely human convention,
arrived at through the intersection of pain and laziness.
There's nothing inherently "right" or "wrong" about where the line was
drawn, so for networks to decide that /24 is causing too much pain,
and moving the line to /23 is no more "right" or "wrong" than drawing
the line at /24.  A network that *counts* on its non-connected sites
being reachable because they're over a mythical /24 limit is no more
right than a customer upset that their /25 announcements aren't being
listened to.


> > To wrap up--I disagree with your assertion because it depends entirely
> > on a 'magic' /24 boundary that makes it OK to filter more specifics
> smaller
> > than it, but not OK to filter larger than that and depend instead on
> covering
> > prefixes, without actually being based on anything concrete in BGP or
> > published standards.
>
> Got any better reasons besides disliking the consensus?
>

Absolutely.

Let BGP work as it's supposed to work.

If there's a covering prefix being announced, according to BGP, it's a
valid pathway to reach all the prefixes contained within it.  If that's
not how your network is constructed, don't send out your announcements
that way.  Only announce prefixes for which you *do* have actual
reachability.

Consensus isn't a guarantee.

"SHOULD" in an RFC is still just a recommendation, and not following
it isn't an error.

If you're worried about memory in your routers, and you decide to move
the line from /24 to /23 or /22, that's not an error, that's not
breaking BGP, that's just moving an arbitrary line that was set by
stressed and busy network engineers nearly 3 decades ago.

If a network engineer feels the need to filter out longer prefixes to
deal with a memory shortage in their devices, that's their decision; my
anecdote was to point out you'll likely run into people who don't
understand BGP very well, and mistakenly think there's some magical
guarantee that /24 or shorter prefixes will always work, while longer
prefixes won't.  And that's just not at all true.  BGP simply looks for
the longest match in the available table, whatever that might be, and
uses whatever the "most specific" match is, no matter how long or short
it might be.  Networks should always keep that in mind when announcing
prefixes; don't announce a prefix you aren't prepared to handle the
traffic for, no matter what traffic-engineering tweaks you might be
using to try to steer traffic away.  You should always assume that for
whatever reason, if you announce a prefix, there's a good chance that
other networks will see that as the best match and make use of it.
If you don't want it used for traffic, don't announce it.
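
For anyone who wants to see that mechanically, here's a toy
longest-prefix-match lookup in Python--purely illustrative, with
made-up prefixes and next-hop labels, not any router's actual code--
showing that the /24 only "protects" you while it's actually in the
table:

    import ipaddress

    # Toy routing table: prefix -> next hop.  Real FIBs use tries or
    # TCAM, but the selection rule is the same: most-specific wins.
    table = {
        ipaddress.ip_network("192.0.0.0/20"): "aggregate-site",
        ipaddress.ip_network("192.0.2.0/24"): "island-site",
    }

    def lookup(dst):
        """Return the next hop of the longest (most specific) match."""
        addr = ipaddress.ip_address(dst)
        matches = [p for p in table if addr in p]
        if not matches:
            return None  # no route at all; the traffic is dropped
        return table[max(matches, key=lambda p: p.prefixlen)]

    print(lookup("192.0.2.10"))  # island-site: the /24 wins
    del table[ipaddress.ip_network("192.0.2.0/24")]
    print(lookup("192.0.2.10"))  # aggregate-site: silent fallback to /20

Filter the /24 away, and the traffic doesn't stop--it just follows the
/20, whether or not the /20's origin can actually deliver it.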

Thanks!

Matt


Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-11 Thread Matthew Petach
On Tue, Oct 11, 2022 at 7:41 AM William Herrin  wrote:

> On Mon, Oct 10, 2022 at 3:37 PM Matthew Petach 
> wrote:
> > They became even more huffy, insisting that we were breaking the
> internet by not
> > following the correct routing for the more-specific /24s which were no
> longer present
> > in our tables.  No amount of trying to explain to them that they should
> not advertise
> > an aggregate route if no connectivity to the more specific constituents
> existed seemed
> > to get the point across.  In their eyes, advertising the /24s meant that
> everyone should
> > follow the more specific route to the final destination directly.
>
> Hi Matthew,
>
> They were correct. If the /24 was reaching your network, traffic
> should not have been following the /20. In your version, they would
> have to disaggregate the /20 into 16 /24s just because you didn't want
> to honor most-specific path routing. That's not what anybody wants.
> Least of all you.
>

I disagree.

To illustrate why, let's take your case a step further, shall we?

Wouldn't that same argument mean that every ISP that isn't honoring
my /26 announcement, but is instead following the covering /24, or /20,
or whatever sized prefix is equally in the wrong?  And what about Fred's
/27 announcement?  Gosh, and now Cindy wants to announce a dozen
/30's--is it everyone else's error for not listening to those announcements?

What makes /24 boundaries magically "OK" to filter on, such that if
you announce something smaller than a /24 that gets filtered, and
traffic goes to the covering aggregate, everyone says "well, that's
just how the Internet works, and of course traffic would be expected
to flow towards the covering announcement", but if I set the boundary
at a different, but still arbitrarily-sized point, like /23, suddenly the
announcing party is right, and I'm wrong?

If the stance is "it doesn't matter if there's a covering prefix, that
announcement doesn't mean you can reach all the prefixes
contained within it, you *must* listen to all the smaller announcements
in order to have reachability", then A) you're redefining how BGP works
in a fundamental way, and B) we should all buy stock in router memory
manufacturers, because they're going to be the next oil companies.

BGP 101 says that if I announce a covering prefix, I'm making a statement
into the BGP routing table that says "you can reach everything contained
within this covering route via me", and that's how the forwarding tables
treat it; any time there's nothing more specific in the table, even due to
a brief transient change on the Internet, traffic for those prefixes will be
forwarded to the router announcing the covering prefix announcement.

If I announce 0/1 into the DFZ and drop any traffic destined for it on the
floor, I'm not going to get much sympathy by saying "well, it's your fault,
you should have been listening to all the more specifics and not trusting
the covering route to actually have reachability to the prefixes contained
within it."  (though that does make me think that if you're a content-heavy
shop looking to balance your traffic flows, it might be an interesting way
to make the point in a very real way to everyone on the Internet...)

To wrap up--I disagree with your assertion because it depends entirely
on a 'magic' /24 boundary that makes it OK to filter more specifics
smaller than it, but not OK to filter larger than that and depend
instead on covering prefixes, without actually being based on anything
concrete in BGP or published standards.

"But that's how we've always done it" is not the same as "but that's how
the protocol works."   ^_^;



Thank you for the discussion!

Matt


Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-10 Thread Matthew Petach
On Mon, Oct 10, 2022 at 8:44 AM Mark Tinka  wrote:

> On 10/10/22 16:58, Edvinas Kairys wrote:
>
> > Hello,
> >
> > We're considering to buy some Cisco boxes - NCS-55A1-24H. That box has
> > 24x100G, but only 2.2mln route (FIB) memory entries. In a near future
> > it will be not enough - so we're thinking to deny all /24s to save the
> > memory. What do you think about that approach - I know it could
> > provide some misbehavior. But theoretically every filtered /24 could
> > be routed via smaller prefix /23 /22 /21 or etc. But of course it
> > could be a situation when denied /24 will not be covered by any
> > smaller prefix.
>
> I wouldn't bank on that.
>
> I am confident I have seen /24's with no covering route, more so for PI
> space from RIR's that may only be able to allocate a /24 and nothing
> shorter.
>
> It would be one heck of an experiment, though :-).
>
> Mark.
>


I may or may not have done something like this at $PREVIOUS_DAY_JOB.

We (might have) discovered some interesting brokenness on the Internet
in doing so; in one case, a peer was sending a /20 across exchange
peering sessions with us, along with some more specific /24s.  After
filtering out the /24s, traffic rightly flowed to the covering /20.
Peer reached out in an outraged huff; the /24s were being advertised
from non-backbone-connected remote sites in their network, that
suddenly couldn't fetch content from us anymore.  Traceroutes from our
side followed the /20 back to their "core", and then died.  They
explained the /24s were being advertised from remote sites without
backbone connections to the site advertising the /20, and we needed to
stop sending traffic to the /20, and send it directly to the /24
instead.  We demurred, and let them know we were correctly following
the information in the routing table.
They became even more huffy, insisting that we were breaking the
internet by not following the correct routing for the more-specific
/24s which were no longer present in our tables.  No amount of trying
to explain to them that they should not advertise an aggregate route
if no connectivity to the more specific constituents existed seemed to
get the point across.  In their eyes, advertising the /24s meant that
everyone should follow the more specific route to the final
destination directly.

So, even seeing a 'covering route' in the table is no guarantee that
you won't create subtle and not-so-subtle breakage when filtering out
more specifics to save table space.   ^_^;

Having (possibly) done this once in the past, I'd strongly recommend
looking for a different solution--or at least be willing to arm your
front-end response team with suitable "No, *you* broke the Internet"
asbestos suits before running a git commit to push your changes out to
all the affected devices in your network.   ;)

Matt


Re: email spam

2022-08-24 Thread Matthew Petach
On Wed, Aug 24, 2022 at 7:28 AM Jawaid Bazyar 
wrote:

> "flawlessly map IP address to GPS coordinates"


Thanks, I needed a good hearty belly laugh to start off the day today.  ;P

*hint*
It's easier to fix the spam problem than it is to map IP addresses to
physical locations in reality-land.

This is one case where xkcd got it wrong.

https://imgs.xkcd.com/comics/tasks.png

Matt


Turkish aphorisms versus Latin aphorisms on the v6 internet

2022-08-20 Thread Matthew Petach
On Mon, Aug 15, 2022 at 8:22 AM VOLKAN KIRIK  wrote:

> the lesson of (/for) the he.net
>
> kolayın tuzağına düşme
>
> do not fall into trap of the easy way.
>
> now tier1 league of them will take like forever for never
>

I would suggest this is more aptly a case of the Latin phrase
"caveat emptor"

What Cogent or Hurricane chooses to do or not do with their
respective networks is their business (quite literally).

What matters is what *you* choose to do as a customer;
and that is why the Latin phrase is so relevant to your
situation:  "Let the buyer beware."

Knowing now as you do that neither network will give
you a "complete" view of the Internet, when it comes
time to renew contracts, it would seem to be wise for
you to select a set of upstream providers that will better
suit your needs.  Using Cogent or HE as one of your
transit providers, along with a less contentious network
will help ensure you have broader coverage for what the
rest of the world considers to be the 'entire' internet.

Putting all of your eggs in either the Cogent or HE baskets
at this stage of the game, however, is likely to lead to some
amount of heartache and pain.

In the immortal words of the Knight Templar guarding the
chalice:
https://www.youtube.com/watch?v=Ubw5N8iVDHI

"He chose...poorly."

The reason you're not getting a great deal of sympathy or
support here, no matter how many times you grumble about
it, is that this is a known conflict, it's been in existence for as
long as some of the people on this list have been working in
the industry, and everyone else has simply shrugged their
shoulders, learned to do more background research before
signing a long-term contract, and moved on with their
businesses.

The only thing that is likely to break the stalemate at this
point is voting with your money.  Take your business elsewhere,
until one side or the other decides getting customer revenue is
more important than a silly entry on a Wikipedia page.  Or, if you
have the resources, become a majority shareholder in one or the
other company, and force the management team to change its
position on the matter.  Given that only one of the two companies
is publicly held, I suspect it should be clearly evident which company
would be the best focus of your energy and money, should you choose
to go in that direction.

If nothing else, please do recognize that no amount of grumbling
here on NANOG is going to have one whit of impact on the position
of either company.   ^_^;

Best of luck!

Matt


Re: Verizon no BGP route to some of AS38365 (182.61.200.0/24)

2022-07-21 Thread Matthew Petach
holow29,

Usually decisions made about limiting route propagation outbound for
certain prefixes tend to happen along financial and operational
constraints.

Remember, outbound route announcements control inbound traffic
volumes.  So, if you've got some full pipes across the ocean, for example,
one way to limit the amount of traffic trying to go across
them is to stop announcing some high-traffic prefixes across them.

Note that it is almost always the route *announcer* that is subtracting
out routes to control traffic volumes; very seldom does the receiver
completely subtract out prefixes heard.  The more usual case is for
the receiver to de-preference the routes down, but to keep them in
their tables as a fallback, because they can control that behaviour.

From the route-exporting network's perspective, however, they have
no reason to believe that the receiving network would listen to and
obey MEDs that they send, to shift traffic away from the congested
path.  They might have tried adding some prepends along the path,
only to discover the receiving network was using LOCALPREF knob
adjustments to control traffic, which kinda shits all over attempts
to traffic engineer via prepending, at which point the only knob you
have left to turn in order to move traffic off the pathway is to stop
announcing prefixes entirely.

As a not-so-parenthetical aside, this is why I *strongly* encourage
networks that use LOCALPREF as a control knob to limit the depth
to which LOCALPREF adjustments are made; ie, allow overriding
of the default LOCALPREF only to paths that are $AVG_PATH_LENGTH+2
or shorter.  That way, if a peer wants to signal a pathway is congested
and should not be used under normal circumstances, they can still do
that by triple-prepending their ASN.  That way, peers can continue to
announce prefixes, but still have some control over traffic flows via
as-path prepending, versus being backed into the corner of having to
completely stop announcing prefixes entirely across certain peering
sessions.
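
To make that concrete, here's a rough sketch in Python of the
import-policy guard I'm describing--the community string and the
average path length are invented for illustration, not anyone's
published policy:

    # Sketch of the LOCALPREF depth limit described above.  The
    # community value and AVG_PATH_LENGTH are illustrative assumptions.
    AVG_PATH_LENGTH = 4             # in practice, measure your own table
    DEFAULT_LOCALPREF = 100
    RAISED_LOCALPREF = 200
    RAISE_COMMUNITY = "65000:200"   # hypothetical "raise my pref" community

    def import_localpref(as_path, communities):
        # Honor the raise request only while the path is short enough
        # that a triple-prepend can still push it below the default.
        if (RAISE_COMMUNITY in communities
                and len(as_path) <= AVG_PATH_LENGTH + 2):
            return RAISED_LOCALPREF
        return DEFAULT_LOCALPREF

    print(import_localpref([64500], [RAISE_COMMUNITY]))      # 200
    print(import_localpref([64500] * 7, [RAISE_COMMUNITY]))  # 100

With a guard like that in place, a peer who triple-prepends falls back
to the default LOCALPREF, and the ordinary as-path comparison gets to
do its job again.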

It is very likely that in this case, CT is not announcing prefixes to AS701
in order to control inbound traffic volumes across certain connections,
because CT has determined that is the only control knob they have
available that works; they have probably already experimented and
found that AS701 squashes inbound MEDs, and uses LOCALPREF
to force traffic flows regardless of as-path prepending.

Your only recourse in a situation like this is likely to be getting
a second connection that bypasses the CT-VZ choke point.

Best of luck!

Matt



On Thu, Jul 21, 2022 at 8:56 AM holow29  wrote:

> In this instance, I would define "correct" behavior as VZ having any route
> to this subnet; after all, customers don't pay VZ to access just their
> network or a subset of the internet. You make a good point, though I would
> expect that if it isn't VZ's business decision to not have this route, they
> would have some sort of vested interest in figuring out why CT is not
> advertising it to them.
> From VZ engineer, I heard that no one is advertising it to AS701, so I
> assume it is not a case of VZ refusing to accept it (which is what I had
> initially assumed having been frustrated in the past with VZ on other
> matters + seeing all their BGP issues in the past few years).
>
> On Thu, Jul 21, 2022 at 11:40 AM Tom Beecher  wrote:
>
>> I would expect Verizon to be able to contact CT and figure out why they
>>> aren't passing the *correct* routes just as I might expect Baidu to do
>>> the same thing, I suppose.
>>>
>>
>> What defines 'correct'? ASN's routinely make traffic engineering or
>> business decisions not to announce certain prefixes to certain other ASNs.
>> This certainly does sometimes cause reachability issues for end users, but
>> that's a choice someone made. That's part of why it's generally good
>> practice to get more than 1 upstream if you can; it gives you more control
>> to mitigate the impact of these choices.
>>
>> It is completely possible that Baidu or CT are intentionally not
>> announcing prefixes to VZ. It is also completely possible that they are and
>> VZ is not accepting it.
>>
>>
>>
>> On Thu, Jul 21, 2022 at 11:24 AM holow29  wrote:
>>
>>> I would expect Verizon to be able to contact CT and figure out why they
>>> aren't passing the correct routes just as I might expect Baidu to do the
>>> same thing, I suppose. Ultimately, whose responsibility is it other than
>>> CT? That is my question. Maybe in this instance, it is common in the
>>> industry for this to be the responsibility of Baidu. That seems
>>> counterintuitive to me as it is Verizon without the proper route
>>> ultimately, but I am not an expert.
>>> Certainly, I think it is incorrect to ask the customer to try to resolve
>>> these issues after bringing it to the attention of these services. If
>>> Verizon couldn't reach one of Google's edge servers, would it be Verizon or
>>> Google's responsibility to fix that if the issue were an 

Assumptions about network designs...

2022-07-11 Thread Matthew Petach
On Mon, Jul 11, 2022 at 9:01 AM Andrey Kostin  wrote:

> It's hard to believe that a same time maintenance affecting so many
> devices in the core network could be approved. Core networks are build
> with redundancy, so that failures can't completely destroy the whole
> network.


I think you might need to re-evaluate your assumption
about how core networks are built.

A well-designed core network will have layers of redundancy
built in, with easy isolation of fault layers, yes.

I've seen (and sometimes worked on) too many networks
that didn't have enough budget for redundancy, and were
built as a string of pearls, one router to the next; if any router
in the string of pearls broke, the entire string of pearls would
come crashing down, to abuse a metaphor just a bit too much.

Really well-thought out redundancy takes a design team that
has enough experience and enough focused hours in the day
to think through different failure modes and lay out the design
ahead of time, before purchases get made.  Many real-world
networks share the same engineers between design, deployment,
and operation of the network--and in that model, operation and
deployment always win over design when it comes time to allocate
engineering hours.  Likewise, if you didn't have the luxury of being
able to lay out the design ahead of time, before purchasing hardware
and leasing facilities, you're likely doing the best you can with locations
that were contracted before you came into the picture, using hardware
that was decided on before you had an opportunity to suggest better
alternatives.

Taking it a step further, and thinking about the large Facebook outage,
even if you did well in the design phase, and chose two different vendors,
with hardware redundancy and site redundancy in your entire core
network, did you also think about redundancy and diversity for the
O side of the house?   Does each redundant data plane have a
diverse control plane and management plane, or would an errant
redistribution of BGP into IGP wipe out both data planes, and both
hardware vendors at the same time?  Likewise, if a bad configuration
push isolates your core network nodes from the "God box" that
controls the device configurations, do you have redundancy in
connectivity to that "God box" so that you can restore known-good
configurations to your core network sites, or are you stuck dispatching
engineers with laptops and USB sticks with configs on them to get
back to a working condition again?

As you follow the control of core networks back up the chain,
you ultimately realize that no network is truly redundant and
diverse.  Every network ultimately comes back to a single point
of failure, and the only distinction you can make is how far up the
ladder you climb before you discover that single point of failure.

Thanks!

Matt


Re: FCC BDC engineer?

2022-07-05 Thread Matthew Petach
On Tue, Jul 5, 2022 at 11:52 AM Bryan Fields  wrote:

> On 7/5/22 1:58 PM, Andrew Latham wrote:
> > I read https://docs.fcc.gov/public/attachments/DA-22-543A1.pdf and a PE
> is
> > not required.
>
> I'd agree.
>
> 47 CFR § 1.7004(d)
> "All providers also shall submit a certification of the accuracy of its
> submissions by a qualified engineer. The engineering certification shall
> state
> that the certified professional engineer or corporate engineering officer
> is
> employed by the provider and has direct knowledge of, or responsibility
> for,
> the generation of the provider's Digital Opportunity Data Collection
> filing."


> Note the lack of capitalization of "qualified engineer".  This means it is
> not
> defined in that part, and leaves it open to interpretation.
>

One could even meet the requirement by focusing on the second clause:

"The engineering certification shall state
that the certified professional engineer *or corporate engineering officer*
is
employed by the provider and has direct knowledge of, or responsibility for,
the generation of the provider's Digital Opportunity Data Collection
filing."
(emphasis mine)

So, if you appoint a Corporate Engineering Officer that is employed by the
provider and has responsibility for the generation of the DODC filing,
you've met the requirements without a need for a certified professional
engineer.

Matt


Re: Serious Juniper Hardware EoL Announcements

2022-06-14 Thread Matthew Petach
On Tue, Jun 14, 2022 at 9:38 AM Adam Thompson 
wrote:

> [Not specific to the Juniper EoLs...]
>
> I sort of agree with Mark:
>
> I've been sampling a fairly wide variety of sources in various parts of
> the global supply chain, and my synthesis of what they're saying is that we
> probably won't *consistently* have the ready availability of "stuff" (both
> electronic and not) we had pre-pandemic, for the rest of my career
> (10-15yrs), and maybe not in the lifetimes of anyone reading this today,
> either.
>


For those who may have forgotten:

https://cacm.acm.org/news/257742-german-factory-fire-could-worsen-global-chip-shortage/fulltext

That was the *sole* supplier of extreme ultraviolet lithography machines
for every major chip manufacturer on the planet.

Chip shortages will only get worse for the next several years.  The light
at the end of the tunnel is unfortunately *not* coming from an ultraviolet
lithography machine.  :(

Matt


2022.06.08 NANOG85 community meeting notes

2022-06-08 Thread Matthew Petach
For members of the broader NANOG community who may not have been able to
see the community meeting that happened this morning, I jotted down some
notes on what was discussed.  Much of the content was directly from the
slide deck

https://storage.googleapis.com/site-media-prod/meetings/NANOG85/4478/20220608_Mcnair_Nanog_85_Community_v1.pdf

I attempted to also include the back-and-forth between the presenters and
the audience
during the question and answer period as well.

Thanks!

Matt



2022.06.08 NANOG85 community meeting


NOTES:

Steve Meuse kicks things off at 0702 hours Pacific time.

Ed and Cat run the community meeting this morning.
Ed welcomes everyone to the third day of NANOG.

First thing is the registration fee increase.
We're going to have to push our costs forward
to get us moving towards the black again.
early bird fee will be $675 versus late fee of $875.
IETF early bird is $700, same for our non-member
price, but we feed you!
We think that's an important part of building community.

Development projects; big project we've been
working on, that's the appointment tool.
It's currently in beta test with a limited pool
of volunteers, the board and the program committee,
and was demoed on May 26th.
There is a dedicated slack channel for the beta
testers to give feedback.
public beta will be released when NANOG86
registration opens up.

Affinity groups:
in the opening presentation, Tina talked about
affinity groups.
To expand our community and help bind it together,
community.nanog.org has affinity groups;
so you can self-gather in areas of common interest;
running, coffee, LGBTQ+, women in tech, etc.
You can also suggest new areas of affinity.
This morning, a group went walking, organized
through the walking affinity groups.

when you land on community.nanog.org;
you need a nanog account, it's free to sign up,
it uses Oauth2 for authentication.
We have a mirror of the mailing list there
as well; it's read only, lets you see how
many replies there are, etc.
it grabs captions from websites for you.
off to the right, shows timeline of communication
that goes along the thread, and major contributors;
it shows what the mailing list has in graphical form.
The affinity groups work the same way; you can
communicate via email or directly through the
forum page.

NANOG college immersion program has been
restructured; more targeted to the most
qualified students; will be sponsor-based,
qualifications will be similar to the
scholarship program, and graduate students
can apply individually.  Undergrads need to
come with a professor who can bring up to
five undergrad students.  The professor
will chaperone and guide the students through
the program.
We had students from Howard attend this meeting.

NANOG84 ombuds report; we have third party
ombuds that listen to issues and concerns
people have, to ensure attendees feel welcome
with our community.  Previously, issues came
to Ed directly, which didn't work if you had
an issue with Ed himself.  By having a third
party fulfill the role, it eliminates that issue.

Look at resources, down at Ombudsman you can
click the link for the full report.

Our most recent grade was 71/100, so we have
some work to do.

He spoke to the Howard NCI students; they
commented they wish they had found it
sooner, it would have had a big impact
on their college trajectory.

Annual report is almost done; was hoping
to have it ready, but things got busy;
will email out when it is done.

Program Committee updates.
congrats to them, they have done a
stellar job with this program, starting
from an effort several years ago to
provide a better experience for our community.

Cat Gurinsky takes over, chair of the
program committee.

How we got here?  We've been doing a
rolling call for presentation; now you
can submit talks for several nanogs ahead.
we can accept sooner, so you can get business
travel arranged, visas, etc. and we can publish
the agenda much sooner.
Now, you can submit all the way through NANOG89.
We already have 4 talks accepted for N86 in
Hollywood; there are 12 talks pending acceptance.

life cycle;
shepherd guides you through the process of submission,
from submission to peer review and acceptance; they
help give suggestions.
The Copy Reviewers ensure NANOG guidelines are
met, clean up issues, make sure fonts are large
enough to read from the back of the room.

we have rolling call for proposals, receive
proposals, shepherd is assigned, draft slides
are due, peer review begins, final slides are
due, agenda is published.

draft slides don't have to be fully fleshed out,
but they need to be good enough to vote on; 70%
is about right.

start with rough outline; they'll tell you if
it's good enough, and vote, and then help you
flesh it out.

your shepherd is your advocate, we want you
to succeed!

With rolling submissions, waiting may mean you
miss out because 

Re: gitlab contact?

2022-04-07 Thread Matthew Petach
On Thu, Apr 7, 2022 at 9:57 AM Dave Taht  wrote:

> Most cloud operations websites are kept internal. gitlab's is not,
> which is pretty cool. In looking over this issue, today:
>
> https://gitlab.com/gitlab-com/gl-infra/production/-/issues/6768
>
> They are tracking tcp syn retransmits, but not drops or other
> congestion control related info. And
> also using:
>
> sysctl net.ipv4.tcp_notsent_lowat=4294967295;
>
> where we've been getting good results with that set as low as 32k.
> Anyone know anyone at gitlab?
>


It looks like that's their current setting, but the test they're
running will be to drop it to 16K:
sysctl net.ipv4.tcp_notsent_lowat=16384;

It'll be interesting to see what the results of the
test are, and whether 16K becomes the new "normal"
for them.   :)
(cool to see this--I spent a certain amount of my time at
my previous job doing kernel parameter tuning for large
scale services, so seeing what values others are testing
with is good validation.   :)
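
For anyone who wants to poke at the same knob, a small sketch--the
/proc path is the Linux sysctl mentioned above, and the fallback value
of 25 is TCP_NOTSENT_LOWAT from linux/tcp.h, for Python builds that
don't expose the constant (Linux-only, naturally):

    import socket

    # Read the system-wide default -- the same value the gitlab issue
    # is tuning via sysctl.
    with open("/proc/sys/net/ipv4/tcp_notsent_lowat") as f:
        print("system default:", f.read().strip())

    # The same limit can also be set per socket.
    TCP_NOTSENT_LOWAT = getattr(socket, "TCP_NOTSENT_LOWAT", 25)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, TCP_NOTSENT_LOWAT, 16384)  # their 16K test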

Matt


Is it time to bring back the IPv6-only hour (well, half hour)?

2022-04-04 Thread Matthew Petach via NANOG
On Thu, Mar 31, 2022 at 2:01 PM Mark Andrews  wrote:

> You have to try running IPv6 only occasionally to weed out the
> dependencies.  You can do this on a per node basis.  Just turn off the IPv4
> interface and see how you run. I do this periodically on my Mac and disable
> IPv4.  This also makes my recursive nameserver IPv6 only as well.  You then
> see what breaks like sites where one of the cdn’s is IPv4 only despite the
> page itself being reachable over IPv6. Or the nameservers are not reachable
> over IPv6.
>
> Write down what you find is broken and report it.
>
> --
> Mark Andrews
>


This reminds me of days gone by, when NANOG used to have an IPv6-only hour
in the agenda, where IPv4 connectivity would be turned off, so people could
identify problem areas.

Unfortunately, it tended to mostly be an excuse to head to the coffee bar,
or enable "offline" mode in your mail client before it started, with little
active engagement in the room.

It might be interesting for NANOG86 in Hollywood to make it a formal part
of the agenda; not just an hour with no IPv4, but a focused half-an-hour in
which the focus of the room is on identifying problem areas; display an
anonymized "word cloud" on the screens in the room and remotely that people
can list sites, vendors, protocols, anything that they observe failing to
function from the point of view of an IPv6-only client.

We've talked about the need for people to "name-and-shame" in order to get
movement from some software and hardware vendors, but people are often
understandably reluctant to put their name on a 'name-and-shame' post that
could jeopardize their job.  Would doing it through an anonymized word
cloud give people more air coverage to list items they see that don't work
in an IPv6-only world?  (Clearly, there's limits; if you're the only
employee of a company, and you discover your employer's VPN endpoints don't
work from a v6-only network, you might think twice about listing it in the
word cloud--anonymization can only do so much to protect you!)

A forum leader at the microphone, making suggestions for services people
should test, functions they could try to exercise, sites they could try to
reach to start the ball rolling; and then as the word cloud starts to fill
in, solicit people's input on similar services to see if they fare any
better.  In fact, having two word clouds, red (doesn't work) and green
(does work) might be an even better idea, so that it's not just a
name-and-shame, but also a name-and-praise session, thanking those who have
done the work to make v6-only connectivity work, and calling out those who
still have work to do.

Or is this a ship that has already sailed, and attempting to resurrect it
will do nothing more than goose coffee sales for a brief interval?

Thoughts and feedback welcome!

Matt


Re: Let's Focus on Moving Forward Re: V6 still not supported re: 202203261833.AYC

2022-04-04 Thread Matthew Petach via NANOG
On Mon, Apr 4, 2022 at 10:41 AM Vasilenko Eduard via NANOG 
wrote:

> 240.0.01.1 address is appointed not to the router. It is appointed to
> Realm.
> It is up to the realm owner (ISP to Enterprise) what particular router (or
> routers) would do translation between realms.
>

Please forgive me as I work this out in my head for a moment.

If I'm a global network with a single ASN on every populated continent
on the planet, this means I would have a single Realm address; for
the sake of the example, let's suppose I'm ASN 42, so my Realm
address is 240.0.0.42.  I have 200+ BGP speaking routers at
exchange points all over the planet where I exchange traffic with
other networks.

In this new model, every border router I have would all use the
same 240.0.0.42 address in the Shaft, and other Realms would
simply hand traffic to the nearest border router of mine, essentially
following a simple Anycast model where the nearest instance of the
Realm address is the one that traffic is handed to, with no way to do
traffic engineering from continent to continent?

Or is there some mechanism whereby different instances of 240.0.0.42
can announce different policies into the Shaft to direct traffic more
appropriately that I'm not understanding from the discussion?

Because if it's one big exercise in enforced Hot Potato Routing with
a single global announcement of your reachability...

...that's gonna fail big-time the first time there's a major undersea
quake in the Strait of Taiwan, which cuts 7/8ths of the trans-pacific
connectivity off, and suddenly you've got the same Realm address
being advertised in the US as in Asia, but with no underlying connectivity
between them.

https://www.submarinenetworks.com/news/cables-cut-after-taiwan-earthquake-2006

We who do not learn from history are doomed to repeat it...badly.   :(

Matt


Re: V6 still not supported

2022-04-02 Thread Matthew Petach
On Fri, Apr 1, 2022 at 6:37 AM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

>
> If you make the stateful NATs static, that is, each
> private address has a statically configured range of
> public port numbers, it is extremely easy because no
> logging is necessary for police grade audit trail
> opacity.

Masataka Ohta
>

Hi Masataka,
One quick question.  If every host is granted a range of public port
numbers on the static stateful NAT device, what happens when
two customers need access to the same port number?

Because there's no way in a DNS NS entry to specify a
port number, if I need to run a DNS server behind this
static NAT, I *have* to be given port 53 in my range;
there's no other way to make DNS work.  This means
that if I have two customers that each need to run a
DNS server, I have to put them on separate static
NAT boxes--because they can't both get access to
port 53.

This limits the effectiveness of a stateful static NAT
box to the number of customers that need hard-wired
port numbers to be mapped through; which, depending
on your customer base, could end up being all of them,
at which point you're back to square one, with every
customer needing at least 1 IPv4 address dedicated
to them on the NAT device.
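
A toy model of the static port-range scheme makes the collision
obvious; the slice size here is invented purely for illustration:

    # One public IPv4 address, each subscriber statically owning a
    # 1000-port slice of it.  Purely illustrative sizing.
    PORTS_PER_SUBSCRIBER = 1000

    def port_range(subscriber_index):
        start = subscriber_index * PORTS_PER_SUBSCRIBER
        return range(start, start + PORTS_PER_SUBSCRIBER)

    # Only subscriber 0's slice contains port 53, and an NS record has
    # no way to say "my DNS server is on port 49153"; everyone else
    # simply cannot run an authoritative server behind this box.
    print(53 in port_range(0))  # True
    print(53 in port_range(1))  # False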

Either that, or you simply tell your customers "so sorry
you didn't get on the Internet soon enough; you're all
second class citizens that can't run your own servers;
if you need to do that, you can go pay Amazon to host
your server needs."

And perhaps that's not as unreasonable as it first sounds;
we may all start running IPv4-IPv6 application gateways
on Amazon, so that IPv6-only networks can still interact
with the IPv4-only internet, and Amazon will be the great
glue that holds it all together.

tl;dr -- "if only we'd thought of putting a port number field
in the NS records in DNS back in 1983..."

Matt


Re: DMARC ViolationAS21299 - 46.42.196.0/24 ASN prepending 255 times

2022-03-31 Thread Matthew Petach
On Thu, Mar 31, 2022 at 3:16 PM Joe Maimon  wrote:

>
>
> Joe Provo wrote:
> > On Fri, Mar 25, 2022 at 11:08:01AM +0300, Paschal Masha wrote:
> >> :) probably the longest prepend in the world.
> >>
> >> A thought though, is it breaking any standard or best practice
> procedures?
> >
> > That said, prepending pretty much anything more than your current view
> > of the Internet's diameter in ASNs is useless in practice. Cascading
> > effects are considered in
> > https://datatracker.ietf.org/doc/draft-ietf-grow-as-path-prepending/
> > where a decent low number (5) is propsed.
> >
> > Chers,
> >
> > Joe
> >
>
> So is there a good way to signal along with a BGP route that the
> originator of the route wants you to know that this route has very high
> suckage factor and even if you normally prefer your peers customers
> whatever, you should perhaps think twice about that for this route,
> cause its really last resort.
>
> Because as-path is an overloaded multimeaning traffic influencing hammer
> that has imprecise and frequently undesirable results. And if that were
> not the case, than discussions of its relative size compared to internet
> diameter would be much more relevant.
>
> Joe
>
>
Unfortunately, the reason crazy-long prepends actually propagate so
widely in the internet core is because most of those decisions to prefer
your peer's customers are done using a relatively big and heavy hammer.

LOCAL_PREF is, in my opinion, the wrong tool to use, but it's what most
of the networks out there seem to have settled on, to the point of having
published BGP communities to use for controlling the LOCAL_PREF setting
on received routes: https://onestep.net/communities/

I've long practiced, and advocated for, the use of MEDs or tweaking
origin codes as a better way to nudge traffic towards customers, peers,
customers of peers, etc., because it still allows as-path to be a
factor in nudging traffic away.  Prepend inbound 3 times on routes
learned from your transit provider, but not on your peers, listen to
MEDs from your peers, and enable always-compare-med and
deterministic-med to allow values to be compared across different
pathways.

That way, someone trying to say "don't use this path" can do a simple
triple-prepend, and see their traffic shift.  In our current world of
using LOCAL_PREF, however, the poor customer keeps prepending more and
more, and never sees their traffic shift.  In desperation, they prepend
the maximum number of times allowed, hoping that maybe this will
somehow do the trick...not understanding that no matter what they do in
the prepend realm, so long as their upstreams are using the LOCAL_PREF
hammer, their prepends will fall on deaf ears.

For the most part--if you think LOCAL_PREF is the right knob to use
for moving traffic, it probably means you need to go back and rethink
your traffic engineering approach.   ^_^;

Matt


Re: Let's Focus on Moving Forward Re: V6 still not supported re: 202203261833.AYC

2022-03-31 Thread Matthew Petach
On Wed, Mar 30, 2022 at 12:47 PM Tom Beecher  wrote:

> If the IETF has really been unable to achieve consensus on properly
>> supporting the currently still dominant internet protocol, that is
>> seriously problematic and a huge process failure.
>>
>
> That is not an accurate statement.
>
> The IETF has achieved consensus on this topic. It's explained here by
> Brian Carpenter.
>
> https://mailarchive.ietf.org/arch/msg/int-area/qWaHXBKT8BOx208SbwWILDXyAUA/
>
> He expressly states with many +1s that if something IPv4 related needs to
> get worked on , it will be worked on, but the consensus solution to V4
> address exhaustion was IPng that became IPv6, so that is considered a
> solved problem.
>
> Some folks don't LIKE the solution, as is their right to do. But the
> problem of V4 address exhaustion is NOT the same thing as "I don't like the
> solution that they chose."
>

I suspect people differ in their understanding of the word "consensus":

https://www.merriam-webster.com/dictionary/consensus

"Definition of *consensus*

1a
: general agreement : UNANIMITY
"

Versus the IETF:
https://tools.ietf.org/id/draft-resnick-on-consensus-01.html
(and subsequently https://datatracker.ietf.org/doc/html/rfc7282 )

specifically, this paragraph:

"Any finding of rough consensus needs at some level to be a satisfactory
explanation to the person(s) raising the issue of why their concern is not
going to be dealt with. A good outcome is for the objector to be satisfied
that, although their issue is not being accommodated in the final product,
they understand and accept the outcome. Remember, if the objector feels
that the issue is so essential that it must be attended to, they always
have the option to file an appeal. A technical error is always a valid
basis for an appeal, and a chair or AD has the freedom and the
responsibility to say, "The group did not take this technical issue into
proper account." Simply having a number of people agreeing to dismiss an
objection is not enough."

It would seem that Brian Carpenter's message drifted more towards the
dictionary definition of "consensus" than what the IETF has historically
used to determine "consensus".

Brian seems to have tried to sweep under the carpet a very serious
problem without properly addressing it, by saying (direct quote):
"We shouldn't be fixing problems that IPv6 already fixes,
and shortage of addresses is certainly in that category."

But as anyone who has tried to deploy IPv6-only networks quickly discovers,
at the present time, you can't deploy an IPv6-only network with any
success on the global internet today.  There's too many IPv6-ish networks
out there that haven't fully established their infrastructure to be
reachable
without v4 connectivity also in place.  In order to deploy an IPv6 network
today, you *must* also have IPv4 addresses to work with.  Try to ping
apple.com via v6, or microsoft.com via v6, and see how far you get.
Closer to home, try to ping juniper.com/juniper.net via v6, or nokia.com,
and you'll find there's a whole bunch of assumptions that you've got
some level of working IPv4 in the picture to talk to your hardware and
software vendors.
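
You can run that experiment yourself in a few lines; a quick sketch
(hostnames only as examples, and note this only checks for published
AAAA records from wherever you run it, not true v6-only reachability
of every dependency):

    import socket

    def has_ipv6(host):
        """True if the name resolves to at least one IPv6 address."""
        try:
            socket.getaddrinfo(host, 443, socket.AF_INET6)
            return True
        except socket.gaierror:
            return False

    for host in ("apple.com", "microsoft.com", "juniper.net", "nokia.com"):
        print(host, has_ipv6(host))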

In short, at the moment, you *can't* deploy IPv6 without also having IPv4
somewhere in your network.  IPv6 hasn't solved the problem of IPv4
address shortage, because you can't functionally deploy IPv6 without
also having at least some IPv4 addresses to act as endpoints.

For the people who already have IPv4 addresses to say "hey, that's
not a problem for us" to everyone who can't get IPv4 addresses is
exactly the problem warned against in section 6 of
https://datatracker.ietf.org/doc/html/rfc7282:

"


6 .  One
hundred people for and five people against might not be rough
consensus

   Section 3 
discussed the idea of consensus being achieved when
   objections had been addressed (that is, properly considered, and
   accommodated if necessary).  Because of this, using rough consensus
   avoids a major pitfall of a straight vote: If there is a minority of
   folks who have a valid technical objection, that objection must be
   dealt with before consensus can be declared. "



The point at which we have parity between IPv4 and IPv6 connectivity
is the point at which we can start to talk about sunsetting IPv4 and
declaring it historic, and no longer concern ourselves with address
exhaustion.  Until then, so long as being able to obtain IPv4
addresses is a mandatory step in being functional on the internet, it
is unreasonable to say that the address exhaustion problem is "solved."

Matt


Re: IPv6 Only

2022-03-31 Thread Matthew Petach
On Thu, Mar 31, 2022 at 5:36 AM Jacques Latour 
wrote:

> Exactly what I was asking, when and how will we collectively turn off the
> lights on IPv4?
>

Working on the World IPv6 Launch {day|week|forever} efforts,
I noticed an interesting pattern of companies that put up IPv6
resources, with all the associated quad-As, and patted themselves
on the back for making themselves available via IPv6; but I couldn't
request those quad-A records via anything but IPv4 transport to their
DNS servers.

I've seen similar behaviour with hardware vendors.  They have great
IPv6 support, their boxes forward and accept IPv6 packets just fine;
but, the deeper you dig, the more you find oddities, like syslog host
destinations that only accept v4 IP addresses, or a requirement for
an IPv4 router ID to be configured.

I don't think we fully grasp just how wide the chasm is between
"we support IPv6" and "we can fully turn off IPv4".

There's a whole lot of "we support IPv6" in the world right now that
comes with lingering IPv4 tendrils that are often under the surface,
or in the darker corners of the config, that just keep working because
most of the IPv6 world is still either dual-stacked, or has a translation
layer that allows the lurking v4 bits to not cause issues.

I don't think we'll be nearly as close to being ready to turn off the
lights
on IPv4 as we think we are, not just because of old customer CPE and
legacy boxes, but because of embedded assumptions deep in software
and firmware stacks.  For example, let's take a relatively modern
enterprise wireless platform:

https://www.arubanetworks.com/techdocs/AOS-CX/10.07/HTML/5200-7852/Content/Chp_ZTP/ztp-sup-aos-cx-10.htm
"

   - ZTP operations are supported over IPv4 connections only. IPv6
   connections are not supported for ZTP operations."

 Sure, the devices pass IPv6 traffic just fine; but you'd better keep your
IPv4
network around so the devices can configure themselves after powering on.

There's a *lot* of code out there that's been carried forward for years,
with dark corners that haven't been touched for a while.  I think we're
going to be stumbling over "can't do that over IPv6 yet" areas for years
and years to come, not because of any willful myopia around the migration
from IPv4 to IPv6, but simply because it's code that doesn't get used very
often, and in dual-stack networks, it just keeps working the few times it
gets exercised.  The only time it would run into a problem is in a pure
IPv6-only network; and how many of those really exist in the world to
flag it as an issue?

And yet, in order to "turn off the lights on IPv4", we're going to have to
root through all those dark corners of code that haven't been touched
in years to update them to work in an IPv6-only world; and that's *really*
pushing the rock uphill, because that's work that isn't going to see any
cost recovery for it at all.  No customer is going to say "I won't buy your
product until you've rooted out every bit of IPv4-only code in your
software".
So, there's really no financial incentive for companies to work towards
getting their software ready for an IPv6-only world.

So--the tl;dr version of my answer to you?
"when" is likely to be "not in any of our lifetimes"--because the "how"
requires completely non-monetizable effort on the part of companies
that have legacy codebases they're carrying forward.

Thanks!

Matt


Re: Cogent ...

2022-03-31 Thread Matthew Petach
On Thu, Mar 31, 2022 at 9:05 AM Paul Timmins  wrote:

> On 3/31/22 11:38, Laura Smith via NANOG wrote:
> > However, perhaps someone would care to elaborate (either on or off-list)
> what the deal is with the requirement to sign NDAs with Cogent before
> they'll discuss things like why they still charge for BGP, or indeed any
> other technical or pricing matters. Seems weird ?!?
>
> Same reason your employer doesn't want employees telling each other
> their salary. Not every similarly situated customer pays the same for
> the same service.
>
>
Having fought that issue[0], I'd like to point out that employees
voluntarily
sharing salary data is federally protected speech in the US, and cannot
be waived through an employment contract:

https://www.nlrb.gov/about-nlrb/rights-we-protect/your-rights/your-rights-to-discuss-wages#:~:text=Under%20the%20National%20Labor%20Relations,for%20mutual%20aid%20or%20protection
.

"Under the National Labor Relations Act (NLRA or the Act), employees have
the right to communicate with other employees at their workplace about
their wages.  Wages are a vital term and condition of employment, and
discussions of wages are often preliminary to organizing or other actions
for mutual aid or protection. "
..."policies that specifically prohibit the discussion of wages are
unlawful."

I understand the parallelism you were aiming for, but
given how many people labour under the mistaken
notion that US companies can forbid you from talking
about your compensation, I felt it prudent to point out
that's actually not a terribly good comparison.   ^_^;

Thanks!

Matt


[0]
https://www.quora.com/My-coworker-asked-about-my-salary-how-should-I-respond/answer/Matthew-Petach


Re: DMARC ViolationAS21299 - 46.42.196.0/24 ASN prepending 255 times

2022-03-26 Thread Matthew Petach
On Fri, Mar 25, 2022 at 6:19 PM Amir Herzberg  wrote:

> Hi Matthew and NANOG,
>
> I don't want to defend prepending 255 times, and can understand filtering
> of extra-prepended-announcements, but I think Matthew may not be correct
> here:
>
>> Anyone that is prepending to do traffic engineering is
>> doing *differential* prepending; that is, a longer number
>> of prepends along one path, with a shorter set of prepends
>> along a different path.
>>
>> So, dropping the inbound announcement with 255 prepends
>> merely means your router will look for the advertisement with
>> a shorter number of prepends on it.
>>
>
> Right. But let's consider the (typical) case where someone is prepending
> for traffic engineering. Now, if you're not very near to the origin of the
> prepended announcement, and still received it (and not the shorter
> alternative), then it is quite likely that you received it since the
> alternate path failed - and the backup path was announced, instead (by
> upstreams of the origin). So your router is quite likely not to receive the
> shorter announcement.
>
>
Note that as-path prepending only matters as a *differential* value.

Choosing between 5 and 8 prepends, for example, gives you 3 levels of
differentiation between the paths.

Prepending 255 times is equivalent to setting MAXCOST in OSPF; it's an
overload setting, saying "don't freaking use this path *EVER*".

If you want to traffic engineer, you set your less preferred path with
say 5 prepends, and your more preferred path with 3 prepends, and
your really really preferred path with 1 prepend.

If you're setting 255 prepends on a path, that's not traffic engineering,
that's equivalent to setting the overload bit; it's the maximum metric
equivalent in a link-state routing protocol.  It's clearly a DO-NOT-USE
indicator, in the same category as community 0xFF04 or
65535:0

In short--if someone sends me 255 prepends, it's going to
be treated the same way as LSInfinity in OSPF.

Matt



> After all, if your router received both short and long announcements (from
> same relationship, e.g., both from providers), then your router would
> probably select the shorter path anyway, without need to filter out the
> long one, right?
>
> So, filtering announcements with many prepends may cause you to lose
> connectivity to these networks. Of course, you may not mind losing
> connectivity to Kazakhstan :) ...
>
> best, Amir
>
>>
>>
>> --
> Amir Herzberg
>
> Comcast professor of Security Innovations, Computer Science and
> Engineering, University of Connecticut
> Homepage: https://sites.google.com/site/amirherzberg/home
> `Applied Introduction to Cryptography' textbook and lectures:
>  https://sites.google.com/site/amirherzberg/applied-crypto-textbook
>
>
>
>
> On Fri, Mar 25, 2022 at 8:19 PM Matthew Petach 
> wrote:
>
>>
>>
>> On Fri, Mar 25, 2022 at 2:59 PM Adam Thompson 
>> wrote:
>>
>>> Tom, how exactly does someone “ride the 0/0” train in the DFZ?
>>>
>>
>> It's not so much "ride the 0/0 train" as much as it is
>> "treat excessive prepends as network-unreachable"
>>
>> Think of prepends beyond say 10 prepends as a way
>> to signal "infinite" distance--essentially, "unreachable"
>> for that prefix along that path.
>>
>> Anyone that is prepending to do traffic engineering is
>> doing *differential* prepending; that is, a longer number
>> of prepends along one path, with a shorter set of prepends
>> along a different path.
>>
>> So, dropping the inbound announcement with 255 prepends
>> merely means your router will look for the advertisement with
>> a shorter number of prepends on it.
>>
>> If you're only announcing one path for your prefix, and it is
>> prepended 255 times, you're fundamentally not understanding
>> how BGP works, and the only way to get a clue-by-four might
>> be to discover you've made your prefix invisible to a significant
>> portion of the internet.
>>
>>
>>>
>>>
>>> I’m connected to both commercial internet and NREN, and
>>> unfortunately-long paths are not uncommon in this scenario, in order to do
>>> traffic steering.  If there’s another solution that affects global
>>> *inbound* traffic distributions, I’d love to hear about it (and so
>>> would a lot of my peers in edu).
>>>
>>>
>>>
>>> If there were a usable way to “dump” the excessively-long path only as
>>> long as a better path was already known by 

Re: DMARC Violation AS21299 - 46.42.196.0/24 ASN prepending 255 times

2022-03-25 Thread Matthew Petach
On Fri, Mar 25, 2022 at 2:59 PM Adam Thompson 
wrote:

> Tom, how exactly does someone “ride the 0/0” train in the DFZ?
>

It's not so much "ride the 0/0 train" as much as it is
"treat excessive prepends as network-unreachable"

Think of prepends beyond say 10 prepends as a way
to signal "infinite" distance--essentially, "unreachable"
for that prefix along that path.

Anyone that is prepending to do traffic engineering is
doing *differential* prepending; that is, a longer number
of prepends along one path, with a shorter set of prepends
along a different path.

So, dropping the inbound announcement with 255 prepends
merely means your router will look for the advertisement with
a shorter number of prepends on it.

If you're only announcing one path for your prefix, and it is
prepended 255 times, you're fundamentally not understanding
how BGP works, and the only way to get a clue-by-four might
be to discover you've made your prefix invisible to a significant
portion of the internet.


>
>
> I’m connected to both commercial internet and NREN, and unfortunately-long
> paths are not uncommon in this scenario, in order to do traffic steering.
> If there’s another solution that affects global *inbound* traffic
> distributions, I’d love to hear about it (and so would a lot of my peers in
> edu).
>
>
>
> If there were a usable way to “dump” the excessively-long path only as
> long as a better path was already known by at least one edge router, that
> might be workable, but you’d have to keep track of it somewhere to
> reinstall it if the primary route went away… at which point you may as well
> have not dropped it in the first place.
>
>
You dump the excessively-long path based on the assumption that
the only reason for a long set of prepends out one path is to shift traffic
away from that path to one that you're advertising out with a *shorter*
set of prepends.

The router doesn't need to 'look' for or 'keep track' of the different
path; the human makes the decision that any sane BGP speaker
would only prepend 255 times on a path if there was a shorter
as-path advertisement they wanted people to use instead.

So, drop the excessively long prepended path, and make use
of the 'should be in the table somewhere' advertisement of the
prefix with fewer prepends.

Easy-peasy.


>
>
> -Adam
>
>
>


Re: "Permanent" DST

2022-03-15 Thread Matthew Petach
Please provide a link documenting this claim.

I have been reviewing the actions listed on congress.gov, and this is not
an action listed as having taken place.

https://www.congress.gov/bill/117th-congress/senate-bill/623/all-actions?overview=closed#tabs

The last action shown for this bill was taken on March 9th, 2021, more than
a year ago.

Thanks!

Matt

On Tue, Mar 15, 2022, 12:14 Jay R. Ashworth  wrote:

> In a unanimous vote today, the US Senate approved a bill which would
>
> 1) Cancel DST permanently, and
> 2) Move every square inch of US territory 15 degrees to the east.
>
> My opinion of this ought to be obvious from my rhetoric.  Hopefully, it
> will
> fail, because it's likely to be the end of rational time worldwide, and
> even
> if you do log in UTC, it will still make your life difficult.
>
> I'm poleaxed; I can't even decide which grounds to scream about this on...
>
> Hopefully, the House or the White House will be more coherent in their
> decision on this engineering construct.
>
> Cheers,
> -- jra
>
> --
> Jay R. Ashworth               Baylink                  j...@baylink.com
> Designer                      The Things I Think       RFC 2100
> Ashworth & Associates         http://www.bcp38.info    2000 Land Rover DII
> St Petersburg FL USA          BCP38: Ask For It By Name!    +1 727 647 1274
>
>


Re: Cogent cutting links to Russia?

2022-03-04 Thread Matthew Petach
On Fri, Mar 4, 2022 at 11:00 PM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Sean Donelan wrote:
>
> > The Russia sanctions are different (see a lawyer),
>
> It seems to me that, according to
>
>
> https://home.treasury.gov/system/files/126/ukraine_overview_of_sanctions.pdf
>
> sanctions to prohibit "the exportation or importation of any
> goods, services, or technology" is "to or from the Crimea
> region of Ukraine", not Russia.
>

That document is from 2016.

I suspect there's a more recent one related to the
current situation.  ^_^;

Matt


Re: Cogent cutting links to Russia?

2022-03-04 Thread Matthew Petach
On Fri, Mar 4, 2022 at 12:55 PM Martin Hannigan  wrote:

>
> I would argue they don't have much of a choice:
>
> "The economic sanctions put in place as a result of the invasion and the
> increasingly uncertain security situation make it impossible for Cogent to
> continue to provide you with service."
>
> I would expect to see others follow suit  if that is the case.
>


That's an interesting slope to slide along...

I fully understand ISPs disconnecting customers for non-payment; we've
all had to do that at one point or another in our careers, I'm sure.
However, that's generally done *after* the customer has demonstrated
an inability or unwillingness to pay their bills.

This doesn't seem to indicate that any existing invoices have gone
unpaid past their due date, but simply that there is *concern* that a
future bill might go unpaid due to the economic sanctions.

I'm not sure that's a good precedent for a service provider to create;
"we may terminate your service at any point if we suspect that at an
unspecified time in the future, you may become unable to pay future
invoices."

Shades of Minority Report.  We'll imprison you today for a crime we
suspect you will commit in the future.   ^_^;

If and when bills go unpaid, I fully support turning off customers.
I worry about the precedent of disconnecting based on suspicions
of what might happen in the future, however.

Matt


Re: Starlink terminals deployed in Ukraine

2022-03-03 Thread Matthew Petach
On Thu, Mar 3, 2022, 07:17 Dorn Hetzel  wrote:

> One hopes there is some respectable, perhaps even paranoid, encryption on
> his control functions.
>
>>
Talk about timely!  We just had a very nice presentation about this in
Austin:

https://storage.googleapis.com/site-media-prod/meetings/NANOG84/2479/20220215_Coggin_Pwned_In_Space_v1.pdf

https://youtu.be/fCVs3VKUyJ8

I'd link to the abstract itself if I could, but it looks like the mobile
view of the nanog site won't let me do that.  ^_^;

Matt


Re: Starlink terminals deployed in Ukraine

2022-03-01 Thread Matthew Petach
On Tue, Mar 1, 2022 at 11:59 AM Scott McGrath  wrote:

> Starlink however forgets that Russia does have anti satellite weapons and
> they probably will not hesitate to use them which will make low earth orbit
> a very dangerous place when Russia starts blowing up the Starlink birds.
> I applaud the humanitarian aspect of providing Starlink service,
> unfortunately there are geopolitical realities like access to space which
> is likely to be negatively impacted if and when Russia starts shooting down
> these birds.  Fortunately, if they start shooting down the birds, the
> debris will burn up in a year or so, unlike geosync orbit where it would
> stay forever.
>

Anti-satellite weapons hearken back to the NASA era of satellite
launches, when satellites cost hundreds of millions of dollars, were
planned years if not decades in advance, and would take an equivalent
amount of time and money to replace if shot down.

Note SpaceX's response when 40 out of 49 satellites were fried shortly
after
launch due to recent solar activity:

https://www.space.com/spacex-starlink-satellites-lost-geomagnetic-storm

Pretty much just a "ho hum, s**t happens, we'll make sure they burn up
safely and don't hit anything on the way down."

And then they launched another 46 birds three weeks later:
https://www.kennedyspacecenter.com/launches-and-events/events-calendar/2022/february/rocket-launch-spacex-falcon-9-starlink-4-8

and a week after that, launched another 50 birds:
https://www.space.com/spacex-50-starlink-satellites-launch-february-2022

Sure, Russia could start shooting them down.
But at the rate SpaceX can build and launch them, in that war of
attrition, I'd put my money on SpaceX, not Russia--and it would
let everyone in the world get a very detailed map of exactly what
the capabilities and limitations of Russia's anti-satellite weaponry
are as they fired it off dozens if not hundreds of times in a relatively
short time period.

I think people are just now waking up to how radically SpaceX has
changed access to space.   ^_^;

Matt


Re: Starlink terminals deployed in Ukraine

2022-03-01 Thread Matthew Petach
On Tue, Mar 1, 2022 at 10:38 AM Crist Clark  wrote:

> So they’re going to offer the service to anyone in a denied area for free
> somehow? How do you send someone a bill or how do they pay it if you can’t
> do business in the country?
>

It's not like Google is billing anyone for using 8.8.8.8 et al.
[for those who immediately respond "this is SpaceX, not Google,
remember, Google already put a billion dollars into the company
to purchase 10% ownership of it; contributing another billion to
fund service to Ukraine wouldn't be beyond their means by any
stretch.]

Besides, it could be a great "free now, but 6 months after an
armistice is signed, you can cancel the service and return the
dish, or start paying our regular monthly service fee" type
situation.

I mean, if starlink offered you free service for N months, and
then at the end, you had to choose to return the dish or start
paying the monthly fee, how likely are you to give it up once
you've gotten used to using it every day?

If we really want to get creative, there's always the carbon offsets
model for industry.  We could create incentive structures for
global companies to buy "democracy credits" through donations
like that, which would offset a similar amount of latitude in doing
business within authoritarian regions.  That way, if you donate
a billion dollars worth of service to support freedom and democracy
in Ukraine, we'll collectively look the other way if you use slave
Uyghur labour to assemble a billion dollars worth of CPE.

In short--there's lots of ways this could work out, beyond a simple
"let's just give it away for free forever" model.   ^_^

Matt


Re: Ukraine request yikes

2022-03-01 Thread Matthew Petach
On Tue, Mar 1, 2022 at 12:19 AM George Herbert 
wrote:

> Posted by Bill Woodcock on Twitter…
> https://twitter.com/woodyatpch/status/1498472865301098500?s=21
>
> https://pastebin.com/DLbmYahS
>
> Ukraine (I think I read as) want ICANN to turn root nameservers off,
> revoke address delegations, and turn off TLDs for Russia.
>
> Seems… instability creating…
>
> -george
>


Information sharing should increase during wartime, not decrease.

Restricting information is more often the playbook of authoritarian
regimes,
and not something we should generally support.

Besides, GhostWriter is based out of Belarus, not Russia proper.  ^_^;
https://www.wired.com/story/ghostwriter-hackers-belarus-russia-misinformation/

Matt


Re: BANDWIDTH and VONAGE lose FCC rules exemption for STIR/SHAKEN

2022-02-24 Thread Matthew Petach
What made my otherwise-largely-quiescent phone go
berserk was joining AARP.

Went from weeks between random telemarketing calls to
now getting sometimes more than 100 calls before
lunchtime.  Today, I couldn't hang up on one person
offering me Medicare benefits fast enough before
another was already beeping on call waiting.  :/

Moral of the story?

If you retire and join AARP, put your most hated
enemy's phone number down instead of yours.  :(

Matt


On Thu, Feb 24, 2022 at 7:28 PM Tom Mitchell 
wrote:

> I've seen an uptick, but nothing too dramatic.  Maybe 4-5 junk calls a day
> - mostly afternoon.
>
> -- Tom
>
>
> On Sun, Feb 20, 2022 at 9:57 AM Josh Luthman 
> wrote:
>
>> Mine exploded since the requirement date.  Some mornings I get a dozen
>> before lunch.
>>
>> On Fri, Feb 18, 2022 at 2:33 PM Michael Thomas  wrote:
>>
>>>
>>> On 2/17/22 11:58 AM, Sean Donelan wrote:
>>> >
>>> >
>>> https://www.fcc.gov/document/fcc-finds-two-providers-failed-fully-implement-stirshaken-0
>>> >
>>> >
>>> > The Federal Communications Commission today took action to ensure that
>>> > voice service providers meet their commitments and obligations to
>>> > implement STIR/SHAKEN standards to combat spoofed robocall scams.
>>> > Specifically, voice service providers Bandwidth and Vonage lost a
>>> > partial exemption from STIR/SHAKEN because they failed to meet
>>> > STIR/SHAKEN implementation commitments and have been referred to the
>>> > FCC’s Enforcement Bureau for further investigation.
>>>
>>>
>>> So for probably a year or so before the Stir/Shaken mandate came, I have
>>> been seeing a lot less phone spam. I don't know if that's typical but it
>>> was quite noticeable for me. What that tells me is that providers likely
>>> started clamping down on their shady customers well ahead of the mandate
>>> which says that regulatory fiat would have been sufficient too. But that
>>> hinges on whether my situation is typical though.
>>>
>>> Mike
>>>
>>>


Re: New minimum speed for US broadband connections

2022-02-16 Thread Matthew Petach
On Wed, Feb 16, 2022 at 1:16 PM Josh Luthman 
wrote:

> I'll once again please ask for specific examples as I continue to see the
> generic "it isn't in some parts of San Jose".
>


You want a specific example?

Friend of mine asked me to help them get better Internet connectivity a few
weeks ago.

They live here:
https://www.google.com/maps/place/Meridian+Woods+Condos/@37.3200394,-121.9792261,17.47z/data=!4m5!3m4!1s0x808fca909a8f5605:0x399cdd468d99300c!8m2!3d37.3190694!4d-121.9818295

Just off of I-280 in the heart of San Jose.

I dug and dug, and called different companies.
The only service they can get there is the 768K DSL service they already
have with AT&T.

Go ahead.  Try it for yourself.

See what service you can order to those condos.

Heart of Silicon Valley.

Worse connectivity than many rural areas.   :(

Matt


Re: What do you think about the "cloudification" of mobile?

2022-01-25 Thread Matthew Petach
On Tue, Jan 25, 2022 at 10:11 AM Michael Thomas  wrote:

>
> [...]
>
> Since everybody has their own wifi it seems that federating all of them
> for pretty good coverage by a provider and charging a nominal fee to
> manage it would suit a lot of people needs. It doesn't need expensive
> spectrum and the real estate is "free". Basically a federation of
> "guestnets".
>
> Mike
>

Which is pretty much what Xfinity is already offering
to their subscribers; use your xfinity login to get onto
the wifi access points of other xfinity users all around
the country, relatively seamlessly.

I'm sure other networks that provide their own CPE are
likely to follow suit as well.

Matt


Re: Redeploying most of 127/8, 0/8, 240/4 and *.0 as unicast

2021-11-21 Thread Matthew Petach
On Sat, Nov 20, 2021 at 6:27 PM Joe Maimon  wrote:

> Tom Beecher wrote:
> [...]
> >
> > IPv6 isn't perfect. That's not an excuse to ignore it and invest the
> > limited resources we have into Yet Another IPv4 Zombification Effort.
> >
> As noted earlier, False Dilemma
>
> Even worse, your thinking presupposes a finite amount of people-effort
> resources that must be properly managed by those superior in some
> fashion with more correct thinking.
>

This is absolutely true in the corporate world.

You have a finite number of people working a finite
number of hours each week on tasks that must be
prioritized based on business needs.

You can't magically make more people appear out
of thin air without spending more money, which is
generally against the needs of the business, and
you can't generally make more working hours appear
in the week without either magic or violating workers
rights.

Thus, you have a finite amount of people-effort resources
which must be managed by those higher up in the corporate
structure.

As an old boss of mine once said... "You sum it up so well."

Matt


Re: DNS hijack?

2021-11-12 Thread Matthew Petach
On Fri, Nov 12, 2021 at 5:55 AM William Herrin  wrote:

> On Thu, Nov 11, 2021 at 6:36 PM Jeff Shultz 
> wrote:
> >
> >
> > Yeah, apparently when a domain expires, a lot of DNS queries to domains
> in that domain's DNS server... get redirected to a Network Solutions "this
> is expired" website at that IP.
> > Even though those domains are perfectly legit and paid up. Or so it was
> explained to me and how it appeared.
>
> Hi Jeff,
>
> Do you mean that there's a delay between when you're recorded as
> having paid up and when everything is correct throughout the DNS
> system? Yes, there is. Your domain expired, you corrected the problem,
> but then there was an unexpected (by you) delay before the interloping
> name resolution was gone?
>
> If you meant something else, I'd like to hear a better description of
> the problem. If not... well of course: that's how the DNS works.
> There's propagation delay imposed by TTLs and refresh intervals before
> old data is discarded. There are a handful of scenarios (e.g.
> old-school browser pinning) where stale data can persist for months.
> Don't let the domain expire before you renew it. Really don't.
>

I suspect it's more a case of

domain foo.com provides DNS service for several other domains,
including bar.com.

bar.com is fully paid up.

foo.com doesn't get paid up on time; expires, but is quickly
re-claimed and paid up again.

queries for bar.com suddenly show up as "this domain is
available" due to foo.com (which provides DNS for bar.com)
having briefly gone into the expired state.  Users of bar.com
are (rightly) confused, as bar.com was never in a jeopardy
state.
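
A toy model of that suspicion, with placeholder domain names (bar.com
resolves only while the domain hosting its nameservers stays registered):

    registered = {"foo.com": False, "bar.com": True}          # foo.com just lapsed
    ns_domain = {"foo.com": "foo.com", "bar.com": "foo.com"}  # bar.com's NS lives under foo.com

    def resolvable(domain):
        # A domain resolves only if it *and* the domain hosting its NS are live.
        return registered[domain] and registered[ns_domain[domain]]

    for d in ("foo.com", "bar.com"):
        print(d, "resolves" if resolvable(d) else "looks expired to users")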

We'll see if Jeff confirms my suspicion of what happened
in this case.   ^_^;

Matt


Re: DNS pulling BGP routes?

2021-10-18 Thread Matthew Petach
On Mon, Oct 18, 2021 at 1:17 PM William Herrin  wrote:

> On Mon, Oct 18, 2021 at 11:47 AM Matthew Petach 
> wrote:
> > On Mon, Oct 18, 2021 at 11:16 AM William Herrin  wrote:
> >> On Mon, Oct 18, 2021 at 10:30 AM Baldur Norddahl
> >>  wrote:
> >> > Around here there are certain expectations if you sell a product
> called IP Transit and other expectations if you call the product paid
> peering. The latter is not providing the whole internet and is cheaper.
> >>
> >> The problem with paid peering is that it creates a conflict of
> >> interest which corruptly influences the company's behavior. Two
> >> customers are paying you in full for a service but if one elects not
> >> to pay you will also deny or degrade the service to the other one who
> >> has, in fact, paid you.
> >
> >
> > The phrase "paying you in full" is the stumbling point with your
> > claim.
> >
> > As Baldur noted, "paid peer [...] is not providing the whole
> > internet and is cheaper."
>
> Since peering customers can only reach transit customers, it follows
> that one of the customers in the equation is a fully-paid transit
> customer. That fully paid customer's service is degraded or denied
> unless the peering customer also pays. Hence the conflict of interest.
>

I'm sorry.  :(

I'm feeling particularly dense this morning, so I'm going to work through
the two cases very slowly to make sure I understand.

Customer A is full transit paying customer.
In case 1, Customer B is a full transit paying customer also.

Customer A announces their prefixes to ISP; as a transit customer,
ISP promises to announce those prefixes to everyone they have a
BGP relationship with, including customer B.  Likewise, ISP provides
a full BGP table, including default if requested, to Customer A, ensuring
Customer A can reach Customer B, and Customer B can reach Customer A.

in case 2, Customer B is a paid peering customer.

Customer A announces their prefixes to ISP; as a transit customer,
the ISP promises to announce those prefixes to everyone they have
a BGP relationship with, including Customer B.  Likewise, ISP provides
a full BGP table, including default if requested, to Customer A, ensuring
Customer A can reach Customer B, and Customer B can reach Customer A.

I'm not seeing how Customer B's status as paid peer versus transit
customer changes either the set of prefixes Customer A sees, or the
spread of Customer A's prefixes to the rest of the Internet.

In short--the amount Customer B is paying or not paying, does not
change the view of prefixes that Customer A sees, nor does it change
the propagation scope of Customer A's prefixes.  As neither of those
two things change, I'm completely failing to see how Customer A's
service is being degraded or denied based on Customer B's choices.

Can you explain what it is I'm missing here?   ^_^;

Regards,
> Bill Herrin
>

Thanks!

Matt


Re: DNS pulling BGP routes?

2021-10-18 Thread Matthew Petach
On Mon, Oct 18, 2021 at 11:16 AM William Herrin  wrote:

> On Mon, Oct 18, 2021 at 10:30 AM Baldur Norddahl
>  wrote:
> > Around here there are certain expectations if you sell a product called
> IP Transit and other expectations if you call the product paid peering. The
> latter is not providing the whole internet and is cheaper.
>
> The problem with paid peering is that it creates a conflict of
> interest which corruptly influences the company's behavior. Two
> customers are paying you in full for a service but if one elects not
> to pay you will also deny or degrade the service to the other one who
> has, in fact, paid you.
>

The phrase "paying you in full" is the stumbling point with your
claim.

As Baldur noted, "paid peer [...] is not providing the whole
internet and is cheaper."

If the two customers are "paying you in full", then they're
paying you for transit, and as such, they get a copy of the
full tables, regardless of how you learn those routes,
whether through a paid relationship or a settlement free
relationship.

If the two customers are *not* paying full price, but are
instead paying the reduced price for "paid peering",
then they each recognize that the set of prefixes they
are receiving, and the spread of their prefixes in return
are inherently limited, *and will change over time as
the customer relationships on each side change*.

Nobody buying "paid peering" expects the list of prefixes
sent and received across those sessions to remain constant
forever.  That would imply no new customers are ever added,
and would imply no customers ever leave, which is clearly
unreasonable in the real world.

If you, as the customer paying for paid peering, see the
list of prefixes decreasing over time, when the contract
comes up for renewal, you are likely to argue for a lower
price, or may decide it's no longer worth it, and decide to
not renew the relationship.

On the other hand, if you, as the provider, are increasing
the number of prefixes being seen across those paid peerings
at a substantial rate, when the next renewal cycle comes up,
you may decide the price for paid peering should go up, because
you're providing more value across those sessions.

Each side evaluates the then-present set of prefixes being
exchanged when the contract comes up for renewal, to
decide if it's still worth it or not.

But if you're "paying in full" for IP transit, then the sessions
should include as much of the full BGP table as possible,
potentially including a default route, and the promise of that
session is to make your prefixes as visible to the entire rest
of the Internet as possible.

(This is, as a small aside, why I don't think Cogent should be
allowed to label their product "IP transit" so long as they are
willfully refusing to propagate their customers' prefixes to
*all* of the rest of the Internet.  So long as they are choosing
to cherry-pick out certain networks that they will *not* propagate
their customers' routes to, they are *not* providing true IP transit,
and should not label it as such.)


>
> Regards,
> Bill Herrin
>

Thanks!

Matt


Re: DNS pulling BGP routes?

2021-10-18 Thread Matthew Petach
On Sun, Oct 17, 2021 at 4:54 AM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Matthew Petach wrote:
>
> > I'd like to take a moment to point out the other problem with this
> > sentence, which is "antitrust agencies".
> >
> > One of the key aspects to both CDN providers and transit
> > providers is they tend to be multi-national organizations with
> > infrastructure in multiple countries on multiple continents.
>
> Your theory that multi-national entities can not be
> targets of anti-trust agencies of individual countries
> and can enjoy world wide oligopoly is totally against
> the reality.
>

*facepalm*

No, the point I was making wasn't that they can't be the
target of antitrust agencies, the point was that there's so
many conflicting jurisdictions that consistent enforcement
in a coordinated fashion is impossible.  We can't even get
countries to agree on what a copyright or a trademark means,
or even what privacy rights a person should have.

I know one content distribution company that was originally
thinking of putting a site in country X; however, after taking a
closer look at the laws in country X, decided instead to put the
site in a nearby country with more favourable laws and to
interconnect with the network providers just outside country X,
thus putting them outside the reach of those laws.

It's really, *really* hard to "regulate" global infrastructure because
it crosses over/under/through so many different jurisdictions; if
one country decides to put considerably stronger restrictions
in place, the reaction by and large is to 'route around the damage'
so to speak.

The lack of success from Brasil's efforts are a good indication
of just how successful per-country regulation of internet providers
tends to be:
https://www.networkworld.com/article/2175352/brazil-to-drop-requirement-that-internet-firms-store-data-locally.html

The GDPR is probably the most successful effort at reining
in global internet companies in recent years, and even there,
when companies ignore it, the resulting fines are a small slap
on the wrist at best, hardly causing them to change their
behaviours:
https://secureprivacy.ai/blog/gdpr-the-6-biggest-fines-enforced-by-regulators-so-far

Even the $5 billion fine Facebook paid to the FTC after the
Cambridge Analytica was really only a $106M fine, with an
extra $4.9B thrown in to make the personal lawsuit go away:
https://www.politico.com/news/2021/09/21/facebook-paid-billions-extra-to-the-ftc-to-spare-zuckerberg-in-data-suit-shareholders-allege-513456

When companies can afford to throw an extra 50x the money
at a regulatory agency to make a problem go away, it's pretty
clear that thinking that regulatory agencies are going to have
enough teeth to fundamentally change the way of life of those
businesses is optimistic at best.

Looking at the top 15 antitrust cases in the US, you can see
how in many cases, the antitrust action was minimally effective
in the long term, as the companies that were split up often ended
up rejoining again, years down the line:
https://stacker.com/stories/3604/15-companies-us-government-tried-break-monopolies



>
> Masataka Ohta
>

Matt


Re: DNS pulling BGP routes?

2021-10-16 Thread Matthew Petach
On Wed, Oct 13, 2021 at 6:26 AM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Matthew Petach wrote:
>
> >>> With an anycast setup using the same IP addresses in every
> >>> location, returning SERVFAIL doesn't have the same effect,
> >>> however, because failing over from anycast address 1 to
> >>> anycast address 2 is likely to be routed to the same pop
> >>> location, where the same result will occur.
> >>
> >> That's why that is a bad idea. Alternative name servers with
> >> different IP addresses should be provided at separate locations.
>
> > Sure.  But that doesn't do anything to help prevent the
> > type of outage that hit Facebook, which was the point I
> > was trying to make in my response. Facebook did use
> > different IP addresses, and it didn't matter, because the
> > underlying health of the network is what was at issue,
> > not the health of the nameservers.
>
> A possible solution is to force unbundling of CDN providers and
> transit providers by antitrust agencies.
>

Other people have already spoken to the misunderstanding or
misuse of the terms "CDN provider" and "transit provider" in this
case.

I'd like to take a moment to point out the other problem with this
sentence, which is "antitrust agencies".

One of the key aspects to both CDN providers and transit
providers is they tend to be multi-national organizations with
infrastructure in multiple countries on multiple continents.

A CDN provider that only exists in one city is a hosting
company, not a CDN.

A transit provider that only provides network connectivity
in one city, or one state, isn't a very valuable transit
provider, since the implicit (and sometimes explicit) promise
the transit network is making to their customers is that they
will carry their IP traffic to the rest of the world, ensuring as
best as they can that their prefixes are visible to others,
and that their packets are carried to other networks, wherever
they may be.

You won't be terribly successful as a transit provider if your
business model is to "carry traffic for your customers all the
way to the edges of the city", or "carry your traffic anywhere
within the country it needs to go, but discard it if it needs to
go outside the country."

So, given that both our CDN provider and our transit network
provider operate in more than one country, what "antitrust
agency" would have jurisdiction over the CDN provider and
the transit provider that could force unbundling of their
services?

What if every country the CDN provider and the transit
provider operate in has a different definition of what it
means to "unbundle" the services?


Then, CDN providers can't pursue efficiency only to kill
> fundamental redundancy of DNS.
>
> For network neutrality, backbone providers *MUST* be neutral
> for contents they carry.
>

Nothing at all requires backbone providers to be neutral.

Backbone networks are free to restrict what traffic or content
passes across their networks.  Indeed, many backbone providers
include in their terms of service lists of traffic that they reserve the
right to block or discard.  Most of the time, those clauses are focused
on traffic which may be injurious to the backbone network or the systems
that support it; but even DDoS traffic which isn't itself injurious to the
backbone, but does impact other customers, may be dropped at the
backbone providers' discretion.


> We should recognize the fundamental difference between
> independent, thus neutral, backbone providers and
> CDN providers with anti-neutral backbone of their own.


Others have, I think, already addressed more directly their
fundamental disagreement with that statement.   ^_^;


> Masataka Ohta
>
>
Thanks!   :)

Matt


Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-12 Thread Matthew Petach
On Tue, Oct 12, 2021 at 2:01 PM Tom Beecher  wrote:

> I think it would be absolutely *stunning* for content providers
>> to turn the model on its head; use a bittorrent like model for
>> caching and serving content out of subscribers homes at
>> recalcitrant ISPs, so that data doesn't come from outside,
>> it comes out of the mesh within the eyeball network, with
>> no clear place for the ISP to stick a $$$ bill to.
>>
>
> I'm familiar with some work and ideas that have gone into such a thing,
> and I'm personally very much against it for non-technical reasons.
>
> Given how far the law lags behind technology, the last thing anyone should
> be ok with is a 3rd party storing bits on ANYTHING in their house, or
> transmitting those bits from a network connection that is registered to
> them.
>

*chortle*

So, I take it you steadfastly block *all* cookies from being stored
or transmitted from your browser at home?

Oh, wait.  You meant it's OK to let some third parties
store and transmit bits from your devices, but only
the ones you like and support, and as long as they're
small bits, and you're sure there's nothing harmful or
illegal in them.

So, that means you check each cookie to make sure
there's nothing in them that could be illegal?

You sure someone hasn't tucked something like
the DeCSS algorithm, or the RSA algorithm into
a cookie in your browser, like this?

https://commons.wikimedia.org/wiki/File:Munitions_T-shirt_(front).jpg
https://www.cafepress.com/+,954530397

The fact of the matter is, every one of us allows
third parties to store data on all our devices, all
the time, and send it back out on the network,
completely unsupervised by us, even though
it could contain data which is illegal to cross
certain arbitrary political boundaries.

I understand where you're coming from, I really
do.

But I don't think people stop and think about just
how completely that ship has sailed, from a legal
standpoint.  You could have been asked by a random
website to store code which is illegal to export in a
cookie which is then offered back up to any other
website in whatever jurisdiction around the globe
that asks for it, and you'll be completely unaware
of it, because we've all gotten past the point of "ask
me about every cookie" being a workable setting on
any of our devices.

Go ahead.  Turn off all cookie support on all your devices
for 24 hours.  Don't let any of that third party data in or out
of your home during that time.

Let me know how well that turns out.

Bonus points if you enforce it on your family/spouse/SO/partner
at the same time, and they're still talking to you at the end of the
24 hours.  ;-P

Matt


Re: DNS pulling BGP routes?

2021-10-12 Thread Matthew Petach
On Tue, Oct 12, 2021 at 8:41 AM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Matthew Petach wrote:
>
> > With an anycast setup using the same IP addresses in every
> > location, returning SERVFAIL doesn't have the same effect,
> > however, because failing over from anycast address 1 to
> > anycast address 2 is likely to be routed to the same pop
> > location, where the same result will occur.
>
> That's why that is a bad idea. Alternative name servers with
> different IP addresses should be provided at separate locations.
>
> Masataka Ohta
>
>
Sure.  But that doesn't do anything to help prevent the
type of outage that hit Facebook, which was the point I
was trying to make in my response.  Facebook did use
different IP addresses, and it didn't matter, because the
underlying health of the network is what was at issue,
not the health of the nameservers.

I agree with you--different IP addresses should be
used in different geographic locations, even with
anycast setups.

But people need to also recognize that's not a
panacea that solves everything, and that it wouldn't
have changed the nature of the outage last week.

Thanks!  :)

Matt


Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-12 Thread Matthew Petach
On Tue, Oct 12, 2021 at 8:16 AM Jared Brown  wrote:

> Mark Tinka wrote:

[...]

>
> > But I doubt that
> > will work, unless someone can think up a clever way to modify BitTorrent
> > to suit today's network architectures.
>   Unless network topology is somehow exposed, this isn't possible. All
> anybody can do is use latency, IP and ASN information as a proxy.
>
>   Nothing is stopping a BitTorrent client from being selective about its
> peers. The current peer selection algorithm optimizes for throughput, not
> adjecency or topology.
>

Thank you to everyone who pointed out this has
already been tried in the past--I wasn't aware of
it, but it stands to reason.  By the time I think of
something as a good idea, there's a high probability
it's already been done somewhere.  ;)

In terms of exposing network topology, remember the
clients get their information on what chunks to fetch
from whom from the tracker.  As each client connects
to the tracker to report what chunks it has, the tracker
can build a mapping of client IP to ASN, coupled with
latency.  For added fanciness, a traceroute towards the
client's public IP can be performed, and then clusters
can be mapped of clients with the highest numbers
of common elements in the traceroute path back,
which would give you a measure of network topological
"closeness".

That is, if the traceroutes to client A and client B are the
same for 12 out of 15 hops, but the traceroutes to client A
and client C are only the same for 8 out of 15 hops, we have
a good hint that client A and client B are probably topologically
closer than client A and client C, and therefore when client A makes
a request for chunks of movie 1, if both client B and client C have
relevant chunks, we would provide client A with client B's information
preferentially over client C's information.
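
As a rough sketch of that closeness ranking (invented hop lists; a real
tracker would be juggling millions of these):

    def shared_hops(path_a, path_b):
        # Count how many leading traceroute hops two paths have in common.
        n = 0
        for hop_a, hop_b in zip(path_a, path_b):
            if hop_a != hop_b:
                break
            n += 1
        return n

    def rank_peers(requester_path, candidate_paths):
        # Closest candidates (most shared hops toward the requester) first.
        return sorted(candidate_paths,
                      key=lambda c: shared_hops(requester_path, candidate_paths[c]),
                      reverse=True)

    traces = {
        "client_B": ["10.0.0.1", "10.0.1.1", "10.0.2.1", "192.0.2.9"],
        "client_C": ["10.0.0.1", "10.0.9.1", "198.51.100.3"],
    }
    path_to_A = ["10.0.0.1", "10.0.1.1", "10.0.2.1", "203.0.113.7"]
    print(rank_peers(path_to_A, traces))   # client_B shares 3 hops, client_C only 1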

That way, the tracker can help cluster data transfers in roughly
topological "closeness".  Over time, you can build up a more and
more accurate topological map as you collect path information
from each tracker back to each client.  For added points, since
we're talking about subscription-based content delivery, associate
each client's IP address(es) with the subscriber login information
and now you have a mapping of where that subscriber watches
content from, over time.  Knowing their viewing history would
allow you to get an idea of what they're likely to watch next, and
where in the network they're likely to watch it, and you can nudge
your pre-seeding of chunks of the next-most-likely-to-be-watched
content to other clients topologically near to where the subscriber
is most likely going to be connected when they want to watch that.

...but perhaps I'm getting a bit too far into the "creepy" factor at
this point.   ^_^;


> - Jared
>
>
Thanks!

Matt


Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-11 Thread Matthew Petach
On Mon, Oct 11, 2021 at 10:09 AM Michael Thomas  wrote:

>
> On 10/11/21 12:49 AM, Matthew Petach wrote:
>
>
> Instead of a 4K stream, drop it to 480 or 240; the eyeball network
> should be happy at the reduced strain the resulting stream puts
> on their network.
>
> As a consumer paying for my 4k stream, I know who I'm calling when it
> drops to 480 and it ain't Netflix. The eyeballs are most definitely not
> happy.
>
> Mike
>

I apologize for that.  I was tired after two back-to-back days
of board meetings, and I missed putting a clear sarcasm
marker on that last line about "the eyeball networks
should be happy at the reduced strain..."  :(

There should have been a clear ;-P at the end of
the line to make it unmistakeable I was poking a
very sharp stick at the eyeball networks and
what it takes to actually make them happy.  ^_^;

Yes--the end consumers really shouldn't be the hostage
in this battle, being moved about the chess board by
either side, whether by their ISP trying to squeeze
more money out of the content side, or by the content
side trying to force more complaints into the service
desk of the ISP.

I mean, imagine this scenario for any other utility.

Pacific Gas and Electric calling up Hoover Dam to
say "hey, we're going to need to charge you some
additional money this month."

Hoover Dam: "...what?"

PG&E: "well, you're sending a lot more electricity to
our customers this month, and we're going to have
to upgrade our power lines to handle it; and since
you're the one sending the electricity, you should
pay for part of the costs."

Hoover Dam: "...we're only sending enough electricity
to meet the demands YOUR customers are placing on
the grid.  If they want to run their air conditioners all
summer long, you need to charge them enough to
cover your costs for it."

Drat.  My analogy just ran out, because I realize the
dollars already flow the other way, and the hydroelectric
station would just laugh at PG and threaten to raise
the cost of the electricity simply for having to listen to their BS.   ^_^;

You can run the same scenario with your municipal water
company, and imagine how it would play out if the municipality
that put the pipes in to every home tried to charge the water
supplier more because homes were taking longer showers.

It's just such a fundamentally broken model, we laugh at it
in any other industry.  :(

Again, I'm sorry for being tired and missing the explicit
sarcasm indicator--not just for you, but for others who also
responded to that paragraph.   ^_^;

Thanks!

Matt


Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-11 Thread Matthew Petach
On Mon, Oct 11, 2021 at 1:01 AM Mark Tinka  wrote:

> However, in an era where content is making a push to get as close to the
> eyeballs as possible, kit getting cheaper and faster because of merchant
> silicon, and abundance of aggregated capacity at exchange points, can we
> leverage the shorter, faster links to change the model?
>
> Mark.
>

I think it would be absolutely *stunning* for content providers
to turn the model on its head; use a bittorrent like model for
caching and serving content out of subscribers homes at
recalcitrant ISPs, so that data doesn't come from outside,
it comes out of the mesh within the eyeball network, with
no clear place for the ISP to stick a $$$ bill to.

Imagine you've got a movie; you slice it into 1,000
encrypted chunks; you make part of your license
agreement for customers a requirement that they
will allow you to use up to 20GB of disk space on
their computer and to serve up data chunks into
the network in return for a slightly cheaper monthly
subscription cost to your service.  You put 1 slice
of that movie on each of 1,000 customers in a
network; then you replicate that across the next
thousand customers, and again for the next
thousand, until you've got enough replicas of
each shard to handle a particular household
going offline.  Your library is still safe from
piracy, no household has more than 1/1000th of a
movie locally, and they don't have the key to decrypt
it anyhow; but they've got 1/1000th of 4, different
movies, and when someone in that ISP wants to watch the
movie, the chunks are being fetched from other households
within the eyeball network.  The content provider would have
shard servers in region, able to serve up any missing shards that
can't be fetched locally within the ISP--but the idea would be that
as the number of subscribers within an ISP goes up, instead of the
ISP seeing a large, single-point-source increase in traffic, what they
see is an overall increase in east-west traffic among their users.

Because the "serving of shards to others" happens primarily while the
user is actively streaming content, you have a natural bell curve; during
peak streaming times, you have more nodes active to serve up shards,
handling the increased demand; at lower demand times, when fewer
people are active, and there's fewer home-nodes to serve shards, the
content network's shard servers can pick up the slack...but that'll
generally
only happen during lower traffic times, when the traffic won't be competing
and potentially causing pain for the ISP.
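
A toy sketch of the chunk-placement piece (invented household names, and the
encryption elided), just to show the round-robin scatter:

    import itertools

    def assign_shards(households, num_chunks, replicas):
        # Map chunk index -> households holding a replica, scattered round-robin
        # so no home ever holds more than a sliver of any one title.
        ring = itertools.cycle(households)
        return {chunk: [next(ring) for _ in range(replicas)]
                for chunk in range(num_chunks)}

    homes = ["home-%04d" % i for i in range(1000)]
    placement = assign_shards(homes, num_chunks=1000, replicas=3)
    print(placement[0])     # ['home-0000', 'home-0001', 'home-0002']
    print(placement[999])   # ['home-0997', 'home-0998', 'home-0999']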

Really, it seems like a win-win scenario.

I'm confident we'll see a content network come out with a model like this
within the next 5 years, at which point the notion of blackmailing content
networks for additional $$$s will be a moot point, because the content will
be distributed and embedded within every major eyeball network already,
whether they like it or not, on their customer's devices.

Let's check back in 2026, and see if someone's become fantastically
successful doing this or not.  ;)

Thanks!

Matt


Re: DNS pulling BGP routes?

2021-10-11 Thread Matthew Petach
On Mon, Oct 11, 2021 at 8:07 AM Christopher Morrow 
wrote:

> On Sat, Oct 9, 2021 at 11:16 AM Masataka Ohta <
> mo...@necom830.hpcl.titech.ac.jp> wrote:
>
>> Bill Woodcock wrote:
>>
>
[...]

>
> it seems that the problem FB ran into was really that there wasn't either:
>"secondary path to communicate: "You are the last one standing, do not
> die"  (to an edge node)
>  or:
>   "maintain a very long/less-preferred path to a core location(s) to
> maintain service in case the CDN disappears"
>
> There are almost certainly more complexities which FB is not discussing in
> their design/deployment which
> affected their services last week, but it doesn't look like they were very
> far off on their deployment, if they
> need to maintain back-end connectivity to serve customers from the CDN
> locales.
>
> -chris
>

Having worked on trying to solve health-checking situations
in large production complexes in the past, I can definitely
say that is is an exponentially difficult problem for a single
site to determine whether it is "safe" for it to fail out, or if
doing so will result in an entire service going offline, short
of having a central controller which tracks every edge site's
health, and can determine "no, we're below $magic_threshold
number of sites, you can't fail yourself out no matter how
unhealthy you think you are".   Which of course you can't
really have, without undoing one of the key reasons for
distributing your serving sites to geographically distant
places in different buildings on different providers--namely
to eliminate single points of failure in your serving infrastructure.
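
For what it's worth, here's a sketch of what that impractical central
controller's veto logic would look like (site names and threshold invented):

    MAGIC_THRESHOLD = 3   # minimum number of sites that must keep serving

    class DrainController:
        def __init__(self, sites):
            self.serving = set(sites)

        def request_drain(self, site):
            # Grant the drain only if enough other sites would keep serving.
            if site not in self.serving:
                return True    # already drained
            if len(self.serving) - 1 < MAGIC_THRESHOLD:
                return False   # "you're the last ones standing -- do not die"
            self.serving.discard(site)
            return True

    ctl = DrainController(["iad", "sjc", "ams", "sin"])
    print(ctl.request_drain("iad"))   # True: three healthy sites remain
    print(ctl.request_drain("sjc"))   # False: would drop below the threshold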

Doing the equivalent of "no router bgp" on your core backbone
is going to make things suck, no matter how you slice it, and
I don't think any amount of tweaking the anycast setup or
DNS values would have made a whit of difference to the
underlying outage.

I think the only question we can armchair quarterback
at this point is whether there were prudent steps that
could go into a design to shorten the recovery interval.

So far, we seem to have collected a few key points:

1) make sure your disaster recovery plan doesn't depend
on your production DNS servers being usable; have
key nodes in /etc/hosts files that are periodically updated
via $automation_tool, but ONLY for non-production,
out-of-band recovery nodes; don't static any of your
production-facing entries.

2) Have a working out-of-band that exists entirely independent
of your production network.  Dial, frame relay, SMDS, LTE
modems, starlink dishes on the roof; pick your poison, but
budget it in for every production site.  Test it monthly to ensure
connectivity to all sites works.  Audit regularly to ensure no
dependencies on the production infrastructure have crept in.

3) Ensure you have a good "oh sh**" physical access plan for
key personnel.  Some of you at a recent virtual happy hour
heard me talk about the time I isolated the credit card payment
center for a $dayjob, which also cut off access for the card readers
to get into it to restore the network.   Use of a fire axe was granted
to on-site personnel during that.  Take the time to think through
how physical access is controlled for every key site in your network,
think about failure scenarios, and have a "in case of emergency,
break glass to get the key" plan in place to shorten recovery times.

4) Have a dependency map/graph of your production network (a toy example
   follows after this list).
 a) if everything dies and you have to restart, what has to come up first?
 b) what dependencies are there that have to be brought up in the right order?
 c) what services are independent and can be brought up in parallel to speed
    up recovery?
 d) does every team supporting services on the critical, dependent pathway
    have 24x7 on-call coverage, and do they know where in the recovery graph
    they're needed?  It doesn't help to have teams that can't start back up
    until step 9 crowding around asking "are you ready for us yet?" when you
    still can't raise the team needed for step 1 on the dependency graph.  ^_^;

5) Do you know how close the nearest personnel are to each POP/CDN node,
   in case you have to do an emergency "drive over with a laptop, hop on the
   console, and issue the following commands" rousting in the middle of the
   night?  If someone lives 3 miles from the CDN node, it's good to know
   that, so you don't call the person who is on-call but 2 hours away without
   first checking if the person 3 miles away can do it faster.
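
As a toy illustration of point 4 (emphatically not anyone's actual tooling),
Python 3.9's stdlib topological sorter can turn an invented dependency map
into ordered recovery steps, grouping independent services for parallel
restart:

    from graphlib import TopologicalSorter

    # service -> the services that must already be up before it can start
    deps = {
        "oob-network":  set(),
        "core-routing": {"oob-network"},
        "dns":          {"core-routing"},
        "auth":         {"dns"},
        "storage":      {"core-routing"},
        "frontend":     {"auth", "storage"},
    }

    ts = TopologicalSorter(deps)
    ts.prepare()
    step = 1
    while ts.is_active():
        ready = ts.get_ready()   # everything here can be restarted in parallel
        print("step %d: bring up %s" % (step, sorted(ready)))
        ts.done(*ready)
        step += 1
    # step 3 prints ['dns', 'storage'] together: independent, so parallelizable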

I'm sure others have even better experiences than I, who can contribute
and add to the list.  If nothing else, perhaps collectively we can help
other companies prepare a bit better, so that when the next big "ooops"
happens, the recovery time can be a little bit shorter.   :)

Thanks!

Matt


Re: DNS pulling BGP routes?

2021-10-11 Thread Matthew Petach
On Sat, Oct 9, 2021 at 1:40 AM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Christopher Morrow wrote:
> >> means their DNS servers were serving the zone, even after they
> >> recognize their zone data were too old, that is, expired.
>
> > that's not what this means. I think Mr. Petach previously described
> > this,
>
> He wrote:
>
> > So, the idea is that if the edge CDN node loses connectivity to
> > the core datacenters, the DNS servers should stop answering
> > queries for A records with the local CDN node's address, and
> > let a different site respond back to the client's DNS request.
>
> which may be performed by standard DNS with short expire period,
> after which name servers will return SERVFAIL and other name
> servers in other edge node with different IP addresses are tried.
>

(Apologies for the delayed response--I had back-to-back board
meetings the past two days which had me completely tied up.)

That is one way in which it *could* be done--but is by no means
the ONLY way in which it can be done.

With an anycast setup using the same IP addresses in every
location, returning SERVFAIL doesn't have the same effect,
however, because failing over from anycast address 1 to
anycast address 2 is likely to be routed to the same pop
location, where the same result will occur.

You don't really want to hunt among different *IP addresses*,
you want to hunt to a different *location*.

This is why withdrawing the BGP announcement from that
location works more effectively, because it allows the clients
to continue querying the same IP address, but get routed to
the next most proximal location.

If you simply return SERVFAIL and have the client pick a
different IP address from the list of NS entries, it falls into
one of two situations:
a) the new IP address is also anycasted, and is therefore
 likely to pick the same pop that is unhealthy, with similar
 results, or
b) the new IP address is *not* anycasted, but is served from
a single geographical location, which means answers given
back by that DNS server are unlikely to be geolocated with
any accuracy, and therefore the content served is also unlikely
to be geographically relevant or correct.
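
As a rough sketch of that withdraw-on-failure approach (the health probe and
the BGP shim below are hypothetical placeholders, not anyone's real
implementation):

    import time

    ANYCAST_PREFIX = "192.0.2.0/24"   # documentation prefix standing in for the DNS anycast block

    def backend_healthy():
        # Hypothetical probe: can this pop still reach the core datacenters?
        return True   # placeholder -- e.g., TCP-check a handful of core VIPs

    def bgp(action, prefix):
        # Hypothetical shim to the local BGP daemon; real daemons (ExaBGP,
        # BIRD, ...) each have their own interface, so this just logs intent.
        print(action, prefix)

    def watchdog(poll_seconds=5):
        announced = True
        while True:
            healthy = backend_healthy()
            if announced and not healthy:
                bgp("withdraw", ANYCAST_PREFIX)   # queries drain to the next-closest pop
                announced = False
            elif not announced and healthy:
                bgp("announce", ANYCAST_PREFIX)   # rejoin the anycast mesh
                announced = True
            time.sleep(poll_seconds)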


>
> It may be that facebook uses all the four name server IP addresses
> in each edge node. But, it effectively kills essential redundancy
> of DNS to have two or more name servers (at separate locations)
> and the natural consequence is, as you can see, mass disaster.
>

Even if the four anycasted nameserver IP addresses weren't
completely overlapping (let's assume as a hypothetical that
a.ns is served out of EU pops, b.ns is served out of NA pops,
c.ns is served out of SA pops, and d.ns is served out of APAC
pops), if all sites run the same healthcheck code, then if the
underlying healthcheck fails, *every site* will decide it is
unhealthy, and stop answering requests; so, all the EU sites
fail health check and stop serving a.ns; all the North America
sites fail health check, and stop serving b.ns...and so forth.

You followed the best practices, you had different NS entries
that were on different subnets, that were geographically
dispersed around the globe, that were redundant for each
other.  But because they all used the same fundamental
health check, they all *independently* decided they were
unhealthy and needed to stop giving out DNS answers,
and instead let one of the other healthier sites take over.


>
> > but: 1) dns server in pop serves some content (ttls aren't
> > important right now)
>
> You MUST distinguish TTL and EXPIRE. They are different.
>

TTL and EXPIRE are irrelevant here.
The only thing changing those values would do is change
how long it took for caching resolvers to reflect the loss of
connectivity at the DNS layer.  Once the underlying layer 3
connectivity had broken, DNS answers became meaningless.
No matter what records were returned, or cached, you couldn't
reach the servers.

Yes, yes, as an academic exercise you can point out that
there's a difference in how and when those DNS records
stop being used, and you're right about that--but in terms
of this particular failure, this particular post-mortem we're
beating to a horse-shaped pulp, it's entirely meaningless.   ^_^;


>
>  > there's not a lot of magic here... and it's not about the zone data
>  > really at all.
>
> Statement of Petach: "the edge CDN node loses connectivity to
> the core datacenters, the DNS servers should stop answering"
> means, with DNS terminology, zone data is expired, which has
> nothing to do with TTL.
>

As you're using my words, I'm going to have to point out that
"the DNS servers should stop answering" does not require that
any change happens *at the DNS layer* -- in this case, the
change can happen at the routing layer, ensuring that even
if some caching resolver out there is completely defiant of
your expire time, you *will not answer* because the query
packets can never reach you in the first place.


>  

Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-11 Thread Matthew Petach
On Sun, Oct 10, 2021 at 2:44 PM Doug Barton  wrote:

> [some snipping below]
>
> Also just to be clear, these are my own opinions, not necessarily shared
> by any current or former employers.
>
> On 10/10/21 12:31 PM, Mark Tinka wrote:
> > On 10/10/21 21:08, Doug Barton wrote
> >> Given that issue, I have some sympathy for eyeball networks wanting to
> >> charge content providers for the increased capacity that is needed to
> >> bring in their content. The cost would be passed on to the content
> >> provider's customers...
> >
> > But eyeballs are already paying you a monthly fee for 100Mbps of service
> > (for example). So they should pay a surcharge, over-and-above that, that
> > determines how they can use that 100Mbps? Seems overly odd, to me.
>
> Yes, I get that. But as you pointed out here and in other comments, the
> ISP market is based entirely on undercutting competitors (with a lot of
> gambling thrown in, as Matthew pointed out).
>
>  [...]

> > So what rat hole does this lead us down into? People who want to stream
> > Youtube should pay their ISP for that? People who want to spend
> > unmentionable hours on Linkedin should be their ISP for that? People who
> > want to gawk over Samsung's web site because they love it so much,
> > should pay their ISP for that?
>
> First, I'm not saying "should." I'm saying that given the market
> economics, having the content providers who use "a lot" of bandwidth do
> something to offset those costs to the ISPs might be the best/least bad
> option. Whether "something" is a local cache box, peering, money, or
>  is something I think that the market should determine.
>

Going back to the fact that it's not the content providers "using"
a lot of bandwidth, it's the eyeball customer *requesting* a lot
of bandwidth, I think the best approach is for the content providers
to help manage traffic levels by lowering bit rates towards eyeball
networks that are feeling strained by their users.

Instead of a 4K stream, drop it to 480 or 240; the eyeball network
should be happy at the reduced strain the resulting stream puts
on their network.
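
(To make that concrete, here's a minimal sketch of the kind of
bit-rate ladder logic involved.  The renditions, the cap, and the
documentation ASN are all invented for illustration -- real ABR
logic lives mostly in the player:)

    # Hypothetical server-side bit-rate cap toward a strained eyeball
    # network.  AS64496 is a documentation ASN (RFC 5398) standing in
    # for whichever network asked for relief; the ladder and threshold
    # are made-up values.
    RENDITIONS_KBPS = [15000, 8000, 4500, 2500, 1200, 700, 400, 235]  # ~4K..240p
    STRAINED_ASNS = {64496}
    CAP_KBPS = 1200  # roughly 480p territory

    def pick_rendition(client_asn: int, throughput_kbps: float) -> int:
        """Highest sustainable rendition, capped for strained ASNs."""
        ceiling = CAP_KBPS if client_asn in STRAINED_ASNS else RENDITIONS_KBPS[0]
        for rate in RENDITIONS_KBPS:
            if rate <= ceiling and rate <= throughput_kbps * 0.8:
                return rate
        return RENDITIONS_KBPS[-1]

    # pick_rendition(64496, 50_000) -> 1200, no matter how fast the
    # last mile is, which is exactly the relief valve described above.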

The content network can even point out they're being a good
Network Citizen by putting up a brief banner at the top of the
stream saying "reducing bit rate to relieve stress on your ISPs
network".  That way, the happy customer knows that the
content provider is doing their part to help their ISP stay
profitable...I mean, doing their part to help the Internet
run better.


> And to answer Matthew's question, I don't know what "a lot" is. I think
> the market should determine that as well.
>

The market *is* determining that at the moment...but not in the
direction people expect.  Instead, it's creating a new market for
intermediaries; imagine you're an eyeball network that happens
to have peering with SKB, and largely inbound traffic flows.
Wouldn't it make sense for you to reach out to a player like
Netflix, and offer to host content cache boxes that happen to
only answer requests coming from SKB IP space, at a price
well below what SKB was going to charge the content provider?
As the eyeball network, you'd see your traffic ratios
balance out as the cache traffic filled your under-utilized outbound
port capacity, and you'd get a bit of additional revenue you otherwise
wouldn't get.  As the content provider, you're serving your customers
for a lower price than SKB wants to charge, and without giving in to
SKB's extortion tactics.  It's a win-win-lose situation, in which the
content provider wins, the eyeball network that has a peering
relationship with SKB wins, and the only loser is SKB, which
doesn't get the additional revenue it was looking for, and actually
helps funnel money to a competitor that they otherwise wouldn't
have gotten.
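
(A sketch of the gating logic, using only the Python standard
library; the prefixes are documentation ranges standing in for
whatever SKB-originated space the peer actually sees:)

    # Sketch: a cache node that only answers requests sourced from the
    # agreed-upon peer's address space.  RFC 5737 / RFC 3849
    # documentation prefixes stand in for real SKB-originated routes.
    import ipaddress

    ALLOWED_PREFIXES = [
        ipaddress.ip_network("198.51.100.0/24"),
        ipaddress.ip_network("2001:db8:1::/48"),
    ]

    def should_serve(client_ip: str) -> bool:
        addr = ipaddress.ip_address(client_ip)
        return any(addr in net for net in ALLOWED_PREFIXES)

    # should_serve("198.51.100.7") -> True; everyone else is refused,
    # so the box never serves traffic outside the agreed arrangement.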

I'm pretty sure this is going to start happening more and more,
as ISPs realize that putting content caches into their IP space
to serve not only their own customers, but also customers of
selected peers can be a source of good leverage in the market.

Matt


Re: S.Korea broadband firm sues Netflix after traffic surge

2021-10-10 Thread Matthew Petach
On Sun, Oct 10, 2021 at 12:12 PM Doug Barton  wrote:

> On 10/1/21 7:45 AM, Mark Tinka wrote:
> > The reason Google, Facebook, Microsoft, Amazon, e.t.c., all built their
> > own global backbones is because of this nonsense that SK Broadband is
> > trying to pull with Netflix. At some point, the content folk will get
> > fed up, and go build it themselves. What an opportunity infrastructure
> > cost itself!
>
> Except that Facebook, Microsoft, and Amazon all caved to SK's demands:
>

I will note that my $previous_employer was a top-10 web content provider
that did *not* pay SK Broadband.  Not all the content providers caved
to SKB.



> One incentive I haven't seen anyone mention is that ISPs don't want to
> charge customers what it really costs to provide them access. If you're
> the only one in your market that is doing that, no one is going to sign
> up because your pricing would be so far out of line with your competition.
>

That's a problem with your (collective) business model, then.

If you sell something for less than it costs to make, it's called a
loss-leader, and while you can do it for a little while, you'll get
very little sympathy if people take advantage of it to drain your
coffers.

If you sell a service for less than it costs to provide, simply
based on the hopes that people won't actually *use* it, that's
called "gambling", and I have very little sympathy for businesses
that gamble and lose.


> Given that issue, I have some sympathy for eyeball networks wanting to
> charge content providers for the increased capacity that is needed to
> bring in their content. The cost would be passed on to the content
> provider's customers (in the same way that corporations don't pay taxes,
> their customers do), so the people on that ISP who are creating the
> increased demand would be (indirectly) paying for the increased
> capacity. That's actually fairer for the other customers who aren't
> Netflix subscribers.
>

That argument makes no sense whatsoever.

What if instead of a single content provider, the extra traffic
was generated by 10,000 small websites, each adding 1/10,000th
of the volume of a single content provider?

The cumulative impact on the eyeball network to handle the
increased traffic is the same whether it comes from one
content provider or from 10,000 separate smaller websites.

Why should it be OK to go after the one content provider,
but not go after the 10,000 smaller websites?

At what point does your argument break down, and can you
defend why that break point makes sense?  Why is it OK to
go after one, two, three, four content providers, but not to
go after every website that is contributing to the increased
traffic volume the eyeball network is handling?

Seriously.  Make your case.
At what point do you draw that line, and say "we can charge
content sites if there's less than 5 of them, but not if there's
more than 10,000 of them?"
How do you defend the choice of where you drew that
arbitrary line?

> The reason that Netflix doesn't want to do it is the same reason that
> ISPs don't want to charge their customers what it really costs to
> provide them access.
>

ISPs who don't charge enough to cover their costs are gambling,
and hoping they get lucky.

When they don't get lucky and lose their bet, they shouldn't
get to cover the shortfall by strong-arming others into paying
the difference.

If you decide that "sender pays" is a fair model for the Internet to
follow, then it needs to be applied equally, not used to cherry-pick
a few companies to extort while leaving everyone else alone.

As it stands, what you're arguing for is completely arbitrary and
unfair.

Matt


Re: DNS pulling BGP routes?

2021-10-06 Thread Matthew Petach
On Wed, Oct 6, 2021 at 10:45 AM Michael Thomas  wrote:

> So if I understand their post correctly, their DNS servers have the
> ability to withdraw routes if they determine they are sub-optimal (fsvo). I
> can certainly understand for the DNS servers to not give answers they
> think are unreachable but there is always the problem that they may be
> partitioned and not the routes themselves. At a minimum, I would think
> they'd need some consensus protocol that says that it's broken across
> multiple servers.
>
> But I just don't understand why this is a good idea at all. Network
> topology is not DNS's bailiwick so using it as a trigger to withdraw
> routes seems really strange and fraught with unintended consequences.
> Why is it a good idea to withdraw the route if it doesn't seem reachable
> from the DNS server? Give answers that are reachable, sure, but to
> actually make a topology decision? Yikes. And what happens to the cached
> answers that still point to the supposedly dead route? They're going to
> fail until the TTL expires anyway, so why is it preferable to withdraw the
> route too?
>
> My guess is that their post while more clear that most doesn't go into
> enough detail, but is it me or does it seem like this is a really weird
> thing to do?
>
> Mike
>


Hi Mike,

You're kinda thinking about this from the wrong angle.

It's not that the route is withdrawn if it doesn't seem reachable
from the DNS server.

It's that your DNS server is geolocating requests to the nearest
content delivery cluster, where the CDN cluster is likely fetching
content from a core datacenter elsewhere.  You don't want that
remote/edge CDN node to give back A records for a CDN node
that is isolated from the rest of the network and can't reach the
datacenter to fetch the necessary content; otherwise, you'll have
clients that reach the page, can load the static elements on the
page, but all the dynamic elements hang, waiting for a fetch to
complete from the origin which won't ever complete.  Not a very
good end user experience.

So, the idea is that if the edge CDN node loses connectivity to
the core datacenters, the DNS servers should stop answering
queries for A records with the local CDN node's address, and
let a different site respond back to the client's DNS request.
In particular, you really don't want the client to even send the
request to the edge CDN node that's been isolated, you want
to allow anycast to find the next-best edge site; so, once the
DNS servers fail the "can-I-reach-my-datacenter" health check,
they stop announcing the Anycast service address to the local
routers; that way, they drop out of the Anycast pool, and normal
Internet routing will ensure the client DNS requests are now sent
to the next-nearest edge CDN cluster for resolution and retrieving
data.
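
(For the curious, the health check is conceptually simple.  Here's a
rough sketch of the usual pattern -- a script feeding announce and
withdraw commands to something like ExaBGP; the service prefix,
datacenter target, and interval are all placeholders, not anyone's
production config:)

    # "Can-I-reach-my-datacenter" check driving an anycast announcement.
    # Written as an ExaBGP-style API process (commands on stdout); the
    # addresses below are documentation ranges, not real infrastructure.
    import socket
    import sys
    import time

    SERVICE_PREFIX = "192.0.2.53/32"      # anycast DNS service address
    ORIGIN_CHECK = ("203.0.113.10", 443)  # something in the core datacenter
    announced = False

    def origin_reachable() -> bool:
        try:
            socket.create_connection(ORIGIN_CHECK, timeout=2).close()
            return True
        except OSError:
            return False

    while True:
        up = origin_reachable()
        if up and not announced:
            sys.stdout.write(f"announce route {SERVICE_PREFIX} next-hop self\n")
            announced = True
        elif not up and announced:
            sys.stdout.write(f"withdraw route {SERVICE_PREFIX} next-hop self\n")
            announced = False
        sys.stdout.flush()
        time.sleep(5)

(In practice you'd add hysteresis -- several consecutive failures
before withdrawing -- so one lost probe doesn't flap the announcement.)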

This works fine for ensuring that one or two edge sites that get
isolated due to fiber cuts don't end up pulling client requests into
them, and subsequently leaving the users hanging, waiting for
data that will never arrive.

However, it fails big-time if *all* sites fail their
"can-I-reach-the-datacenter"
check simultaneously.  When I was involved in the decision making
on a design like this, a choice was made to have a set of "really core"
sites in the middle of the network always announce the anycast prefixes,
as a fallback, so even if the routing wasn't optimal to reach them, the
users would still get *some* level of reply back.

In this situation, that would have ensured that at least some DNS
servers were reachable; but it wouldn't have fixed the "oh crap we
pushed 'no router bgp' out to all the routers at the same time" type
problem.  But that isn't really the core of your question, so we'll
just quietly push that aside for now.   ^_^;

Point being--it's useful and normal for edge sites that may become
isolated from the rest of the network to be configured to stop announcing
the Anycast service address for DNS out to local peers and transit
providers at that site during the period in which they are isolated, to
prevent users from being directed to CDN servers which can't fetch
content from the origin servers in the datacenter.  It's just generally
assumed that not every site will become "isolated" at the same time
like that.   :)

I hope this helps clear up the confusion.

Thanks!

Matt


Re: Facebook post-mortems...

2021-10-05 Thread Matthew Petach
On Tue, Oct 5, 2021 at 8:57 AM Kain, Becki (.)  wrote:

> Why ever would you have a card reader on your external facing network, if that
> was really the case why they couldn't get in to fix it?
>

Let's hypothesize for a moment.

Let's suppose you've decided that certificate-based
authentication is the cat's meow, and so you've got
dot1x authentication on every network port in your
corporate environment, all your users are authenticated
via certificates, all properly signed all the way up the
chain to the root trust anchor.

Life is good.

But then you have a bad network day.  Suddenly,
you can't talk to upstream registries/registrars,
you can't reach the trust anchor for your certificates,
and you discover that all the laptops plugged into
your network switches are failing to validate their
authenticity; sure, you're on the network, but you're
in a guest vlan, with no access.  Your user credentials
aren't able to be validated, so you're stuck with the
base level of access, which doesn't let you into the
OOB network.

Turns out your card readers were all counting on
dot1x authentication to get them into the right vlan
as well, and with the network buggered up, the
switches can't validate *their* certificates either,
so the door badge card readers just flash their
LEDs impotently when you wave your badge at
them.

Remember, one attribute of certificates is that they are
designated as valid for a particular domain, or set of
subdomains with a wildcard; that is, an authenticator needs
to know where the certificate is being presented to know if
it is valid within that scope or not.   You can do that scope
validation through several different mechanisms,
such as through a chain of trust to a certificate authority,
or through DNSSEC with DANE--but fundamentally,
all certificates have a scope within which they are valid,
and a means to identify in which scope they are being
used.  And whether your certificate chain of trust is
anchored in certificate authorities or in DANE,
that trust has to be validated by something
other than the client and server alone--which generally
makes them dependent on some level of external
network connectivity being present in order to properly
function.   [yes, yes, we can have a side discussion about
having every authentication server self-sign certificates
as its own CA, and thus eliminate external network
connectivity dependencies--but that's an administrative
nightmare that I don't think any large organization would
sign up for.]

So, all of the client certificates and authorization servers
we're talking about exist on your internal network, but they
all counted on reachability to your infrastructure
servers in order to properly authenticate and grant
access to devices and people.  If your BGP update
made your infrastructure servers, such as DNS servers,
become unreachable, then suddenly you might well
find yourself locked out both physically and logically
from your own network.
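
(To make the failure mode concrete, here's a toy sketch -- not any
real NAC product -- of an authenticator whose policy requires a
reachable validation service before granting full access.  The
hostname, port, and VLAN names are invented for the example:)

    # Toy illustration: full access requires both a valid certificate
    # chain AND a reachable external validation service (OCSP/CA/DNS).
    # When the network is broken, everyone -- laptops and door badge
    # readers alike -- lands in the guest VLAN.
    import socket

    VALIDATION_SERVICE = ("ocsp.example.net", 80)  # hypothetical responder

    def validation_reachable() -> bool:
        try:
            socket.create_connection(VALIDATION_SERVICE, timeout=3).close()
            return True
        except OSError:
            return False

    def assign_vlan(cert_chain_ok: bool) -> str:
        if cert_chain_ok and validation_reachable():
            return "corp-vlan"
        return "guest-vlan"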

Again, this is purely hypothetical, but it's one scenario
in which a routing-level "oops" could end up causing
physical-entry denial, as well as logical network access
level denial, without actually having those authentication
systems on external facing networks.

Certificate-based authentication is scalable and cool, but
it's really important to think through even the "that'll
never happen" failure scenarios when deploying it into
critical systems.  It's always good to have the "break glass
in case of emergency" network that doesn't rely on dot1x,
that works without DNS, without NTP, without RADIUS,
or any other external system, with a binder with printouts
of the IP addresses of all your really critical servers and
routers in it which gets updated a few times a year, so that
when the SHTF, a person sitting at a laptop plugged into
that network with the binder next to them can get into the
emergency-only local account on each router to fix things.

And yes, you want every command that local emergency-only
user types into a router to be logged, because someone
wanting to create mischief in your network is going to aim
for that account access if they can get it; so watch it like a
hawk, and the only time it had better be accessed and used
is when the big red panic button has already been hit, and
the executives are huddled around speakerphones wanting
to know just how fast you can get things working again.  ^_^;

I know nothing of the incident in question.  But sitting at home,
hypothesizing about ways in which things could go wrong, this
is one of the reasons why I still configure static emergency
accounts on network devices, even with centrally administered
account systems, and why there's always a set of "no dot1x"
ports that work to get into the OOB/management network even
when everything else has gone toes-up.   :)

So--that's one way in which an outage like this could have
locked people out of buildings.   ^_^;

Thanks!

Matt
[ready for the deluge of people pointing out I've 

Re: massive facebook outage presently

2021-10-04 Thread Matthew Petach
On Mon, Oct 4, 2021 at 11:59 AM Jason Kuehl  wrote:

> I mean, you're an idiot if you post that public on the internet about
> your own place of work. What do you think would happen? Nothing? He should
> never have said anything, but now the Facebook hitman got him.
>
>
Some of us have done that, and survived[0].

But I would be the first to admit I've led a
very charmed life in that regard.   ^_^;

Matt

[0]
https://www.computerworld.com/article/2529621/networking-glitch-knocks-yahoo-offline-for-some.html

