Re: "Tactical" /24 announcements

2021-08-19 Thread Ben Maddison via NANOG
Hi David,

On 08/19, David Bass wrote:
> Ben,
> 
> Yes, sorry.
> 
> Pulling/pushing the config data to a server, and then managing it there in
> addition to on the box.  Like, if I want to run some reports to see how
> many PL are defined on each box, it’s easier to do that with the data
> centralized and managed.
> 
Thanks for clarifying.

A bit of additional context:

We build and push whole device configs, doing a full replace on every
change.
The configs are built from centralized, version controlled data which
describes device connectivity, customer services, etc.
On every change, we retrieve a diff from the device (e.g. show arch
config diff ... on IOS).
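
A minimal sketch of the build-then-diff workflow described above, assuming a
made-up device schema and a toy renderer. The names render_config and
config_diff are hypothetical, and Python's difflib stands in for the diff
that is actually retrieved from the device itself:

```python
import difflib


def render_config(device: dict) -> str:
    """Render a whole-device config from structured data (hypothetical schema)."""
    lines = [f"hostname {device['hostname']}"]
    for intf in device["interfaces"]:
        lines.append(f"interface {intf['name']}")
        lines.append(f"   description {intf['description']}")
    return "\n".join(lines) + "\n"


def config_diff(running: str, candidate: str) -> str:
    """Return a unified diff between two configs, or "" if they are identical."""
    return "".join(
        difflib.unified_diff(
            running.splitlines(keepends=True),
            candidate.splitlines(keepends=True),
            fromfile="running",
            tofile="candidate",
        )
    )
```

With this shape, a change to data we don't own (e.g. an IRR object) would only
show up in the diff if its expansion were rendered into the config text.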

Having the *contents* of IRR derived prefix-lists in the configs has two
major downsides:
- it makes the config dependent on data that we don't own (i.e. a box
  gets a new config even though we didn't change any of our internal
  data), which in turn makes the diffs large and noisy; and
- the size of the generated configs is huge, which slows down deployment
  and makes the whole process fragile.

The tool I mentioned allows us to put a single line (syntactically
equivalent to an empty prefix list) in the generated config. The agent
"sees" that line, and fills in the details, keeping it up to date.
The contents of the list never show up in a "show run", keeping noise
levels down.

There are ultimately three sources of policy data that contribute to the
runtime operation of a device:
- config pushed from our deployment tools
- rpki-rtr data
- prefix-list contents, from our mirrors on the various IRR DBs

If I need to know what prefix lists are on a given box, and what they
contain, I can simply look at those data sources directly.

The key to reliability here is to share as much logic between
operational tools as possible, so that you can be confident that the
"ad-hoc troubleshooting tool" gives an answer that is consistent with
the "config generation tool".

Hope that kinda answers the question?

Cheers,

Ben




Re: "Tactical" /24 announcements

2021-08-19 Thread David Bass
Ben,

Yes, sorry.

Pulling/pushing the config data to a server, and then managing it there in
addition to on the box.  Like, if I want to run some reports to see how
many PL are defined on each box, it’s easier to do that with the data
centralized and managed.

David

On Thu, Aug 19, 2021 at 6:35 AM Ben Maddison  wrote:

> Hi David,
>
> On 08/18, David Bass wrote:
> > I'm also in the externally managed space...very cool tool though.  I love
> > the idea of distributing some of this functionality.
> >
> > Are you also exporting and managing this data outside?
> >
> [assuming that was directed to me...]
>
> I'm not sure what you mean by "exporting and managing this data
> outside".
> Would you elaborate?
>
> Cheers,
>
> Ben
>


Re: "Tactical" /24 announcements

2021-08-19 Thread Ben Maddison via NANOG
Hi Randy,

On 08/17, Randy Bush wrote:
> for junos, i build the prefix list externally and push config.  sad to
> say, the code is so old ('90s) that it's pearl and uses `peval`.  i
> should fix but (copious spare time) == 0.
> 
Spare time must be > 0 if you're willing to wait for peval to finish ;-)

> originally i tried to also build and push for cisco ios classic, but it
> died in the push.  breathe on the router and it reset bgp sessions.  i
> gather from heas that things are better these years.
> 
Better, but not good (or even tolerable). There is a reason I didn't
provide an example of doing this kind of thing on IOS classic/XE.

> i guess i really should have a go at doing it for arcos, but ...
> 
The thing that EOS and JUNOS have in common that allows this to work is
a mechanism to store config state outside the main "running config".

In JUNOS that's the ephemeral DBs, in EOS they call it "URL based
import" for as-path and prefix lists.

If ArcOS doesn't already have something similar, I'd get it on the list.

Cheers,

Ben




Re: "Tactical" /24 announcements

2021-08-19 Thread Ben Maddison via NANOG
Hi David,

On 08/18, David Bass wrote:
> I'm also in the externally managed space...very cool tool though.  I love
> the idea of distributing some of this functionality.
> 
> Are you also exporting and managing this data outside?
> 
[assuming that was directed to me...]

I'm not sure what you mean by "exporting and managing this data
outside".
Would you elaborate?

Cheers,

Ben




Re: "Tactical" /24 announcements

2021-08-18 Thread David Bass
I'm also in the externally managed space...very cool tool though.  I love
the idea of distributing some of this functionality.

Are you also exporting and managing this data outside?

On Tue, Aug 17, 2021 at 12:23 PM Ben Maddison via NANOG wrote:

> Hi Saku,
>
> On 08/17, Saku Ytti wrote:
> > I share your confusion Randy. It seems like perhaps Jakob answered a
> > slightly different question and his answer is roughly.
> >
> > a) Use this as-set feature to ensure valid set of ASNs from given peer
> > b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
> > and invalids)
> > c) Don't punch in prefix-lists anywhere
> >
> > Which in theory works, but in practice it does not, as RPKI validity
> > cover is incomplete.
> >
> This, and (more fundamentally) RPKI-breakage gets translated into a
> dataplane outage.
>
> > Somewhat related, when JNPR implemented RTR the architecture was
> > planned so that the RTR implementation itself isn't tightly coupled to
> > RPKI validity. It was planned day1 that customers could have multiple
> > RTR setups feeding prefixes and the NOS side could use these for other
> > purposes too. So technically JNPR is mostly missing CLI work to allow
> > you to feed prefix-lists dynamically over RTR, instead of punching
> > them in vendor-specific way in config.
> >
> We already do essentially this on arista EOS using a custom agent.
>
> It runs under the EOS process supervisor and calls home to a REST-API
> wrapper around bgpq3. It looks for specific config lines to work out
> which prefix lists to build, and then fetches them on a configurable
> interval.
>
> This has been in production for a year or two, without major incident.
> It's all open source, available at
> https://github.com/wolcomm/eos-prefix-list-agent.
> Pull-requests welcomed ;-)
>
> I'm in the middle of writing the equivalent tool for junos at the
> moment. Assuming that it works, we'll open source that too.
>
> HTH,
>
> Ben
>


Re: "Tactical" /24 announcements

2021-08-17 Thread Tim Raphael
We do something similar - build the prefix lists externally (based on
PeeringDB, IRR, RPKI data) and push them with config management on regular
intervals.
This sort of automated policy architecture is clearly becoming more common,
and the drive (see: MANRS) is ever-increasing.
I'd really like some sort of dynamic, standard method to achieve this
off-box.

> It's all open source, available at
> https://github.com/wolcomm/eos-prefix-list-agent

Very neat indeed!

- Tim

On Wed, Aug 18, 2021 at 2:45 AM Randy Bush  wrote:

> for junos, i build the prefix list externally and push config.  sad to
> say, the code is so old ('90s) that it's pearl and uses `peval`.  i
> should fix but (copious spare time) == 0.
>
> originally i tried to also build and push for cisco ios classic, but it
> died in the push.  breathe on the router and it reset bgp sessions.  i
> gather from heas that things are better these years.
>
> i guess i really should have a go at doing it for arcos, but ...
>
> > It's all open source, available at
> > https://github.com/wolcomm/eos-prefix-list-agent
>
> very cool.
>
> randy
>


RE: "Tactical" /24 announcements

2021-08-17 Thread Jakob Heitz (jheitz) via NANOG
Oh, and your other issue. IOS-XR has two modes in which you can use
RPKI validity. One is where the router automatically uses the
validity. The other mode is where you use the validity in any
way you want in route-policy.

Regards,
Jakob.

-----Original Message-----
From: Jakob Heitz (jheitz) 
Sent: Tuesday, August 17, 2021 9:59 AM
To: nanog@nanog.org
Subject: RE: "Tactical" /24 announcements

> RPKI validity cover is incomplete.
One way: add your own RTR records. They don't all have to come from
the RPKI.
Another way: Add route-policy to validate the origin-as.
That requires a prefix-set. However, these prefix-sets are much smaller
and the sum of them is smaller than the sum of prefix-sets you would
use on your neighbor sessions.

Regards,
Jakob.

-----Original Message-----
Date: Tue, 17 Aug 2021 09:22:01 +0300
From: Saku Ytti 

I share your confusion Randy. It seems like perhaps Jakob answered a
slightly different question and his answer is roughly.

a) Use this as-set feature to ensure valid set of ASNs from given peer
b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
and invalids)
c) Don't punch in prefix-lists anywhere

Which in theory works, but in practice it does not, as RPKI validity
cover is incomplete.

Somewhat related, when JNPR implemented RTR the architecture was
planned so that the RTR implementation itself isn't tightly coupled to
RPKI validity. It was planned day1 that customers could have multiple
RTR setups feeding prefixes and the NOS side could use these for other
purposes too. So technically JNPR is mostly missing CLI work to allow
you to feed prefix-lists dynamically over RTR, instead of punching
them in vendor-specific way in config.

I really hope JNPR does that work, I really like the appeal of doing
things off-box and using the same protocol to talk to on-box. Also,
give me gRPC/protobuf route policy API, so I can write my route-policy
in a real programming language once for all my NOS.


On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>
> hi jakob,
>
> i am confused between
>
> > There is no expansion to prefix-set.
>
> and your earlier
>
> >> We have introduced the scalable as-set into the XR route policy language.
> >> as-path-set does not scale well with 1000's of ASNs.
> >> Now, you don't need to expand AS-SET into prefix-set, just enter it 
> >> directly.
>
> expanding AS-SET into prefix filters is exactly what we do.
>
> ```
> % peval -s RIPE AS-RG-SEA
> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
> ```
>
> i do not see how to get around this.  clue bat please
>
> randy



-- 
  ++ytti


RE: "Tactical" /24 announcements

2021-08-17 Thread Jakob Heitz (jheitz) via NANOG
> RPKI validity cover is incomplete.
One way: add your own RTR records. They don't all have to come from
the RPKI.
Another way: Add route-policy to validate the origin-as.
That requires a prefix-set. However, these prefix-sets are much smaller
and the sum of them is smaller than the sum of prefix-sets you would
use on your neighbor sessions.

Regards,
Jakob.

-----Original Message-----
Date: Tue, 17 Aug 2021 09:22:01 +0300
From: Saku Ytti 

I share your confusion Randy. It seems like perhaps Jakob answered a
slightly different question and his answer is roughly.

a) Use this as-set feature to ensure valid set of ASNs from given peer
b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
and invalids)
c) Don't punch in prefix-lists anywhere

Which in theory works, but in practice it does not, as RPKI validity
cover is incomplete.

Somewhat related, when JNPR implemented RTR the architecture was
planned so that the RTR implementation itself isn't tightly coupled to
RPKI validity. It was planned day1 that customers could have multiple
RTR setups feeding prefixes and the NOS side could use these for other
purposes too. So technically JNPR is mostly missing CLI work to allow
you to feed prefix-lists dynamically over RTR, instead of punching
them in vendor-specific way in config.

I really hope JNPR does that work, I really like the appeal of doing
things off-box and using the same protocol to talk to on-box. Also,
give me gRPC/protobuf route policy API, so I can write my route-policy
in a real programming language once for all my NOS.


On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>
> hi jakob,
>
> i am confused between
>
> > There is no expansion to prefix-set.
>
> and your earlier
>
> >> We have introduced the scalable as-set into the XR route policy language.
> >> as-path-set does not scale well with 1000's of ASNs.
> >> Now, you don't need to expand AS-SET into prefix-set, just enter it 
> >> directly.
>
> expanding AS-SET into prefix filters is exactly what we do.
>
> ```
> % peval -s RIPE AS-RG-SEA
> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
> ```
>
> i do not see how to get around this.  clue bat please
>
> randy



-- 
  ++ytti


Re: "Tactical" /24 announcements

2021-08-17 Thread Randy Bush
for junos, i build the prefix list externally and push config.  sad to
say, the code is so old ('90s) that it's perl and uses `peval`.  i
should fix but (copious spare time) == 0.

originally i tried to also build and push for cisco ios classic, but it
died in the push.  breathe on the router and it reset bgp sessions.  i
gather from heas that things are better these years.

i guess i really should have a go at doing it for arcos, but ...

> It's all open source, available at
> https://github.com/wolcomm/eos-prefix-list-agent

very cool.

randy


Re: "Tactical" /24 announcements

2021-08-17 Thread Ben Maddison via NANOG
Hi Saku,

On 08/17, Saku Ytti wrote:
> I share your confusion Randy. It seems like perhaps Jakob answered a
> slightly different question and his answer is roughly.
> 
> a) Use this as-set feature to ensure valid set of ASNs from given peer
> b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
> and invalids)
> c) Don't punch in prefix-lists anywhere
> 
> Which in theory works, but in practice it does not, as RPKI validity
> cover is incomplete.
> 
This, and (more fundamentally) RPKI-breakage gets translated into a dataplane
outage.

> Somewhat related, when JNPR implemented RTR the architecture was
> planned so that the RTR implementation itself isn't tightly coupled to
> RPKI validity. It was planned day1 that customers could have multiple
> RTR setups feeding prefixes and the NOS side could use these for other
> purposes too. So technically JNPR is mostly missing CLI work to allow
> you to feed prefix-lists dynamically over RTR, instead of punching
> them in vendor-specific way in config.
> 
We already do essentially this on arista EOS using a custom agent.

It runs under the EOS process supervisor and calls home to a REST-API
wrapper around bgpq3. It looks for specific config lines to work out
which prefix lists to build, and then fetches them on a configurable
interval.
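
A rough sketch of the "look for specific config lines, then fetch" loop
described above. The marker syntax, the function names, and the injected
fetch callable are all hypothetical and are NOT the real agent's config
format or API; see the repository for the actual behaviour:

```python
import re
from typing import Callable, Dict, List

# Hypothetical marker format, NOT the real agent's syntax:
#   prefix-list <list-name> irr-object <AS-SET or ASN>
MARKER = re.compile(r"^prefix-list\s+(?P<name>\S+)\s+irr-object\s+(?P<obj>\S+)$")


def find_managed_lists(config_text: str) -> Dict[str, str]:
    """Map each managed prefix-list name to the IRR object it should track."""
    managed = {}
    for line in config_text.splitlines():
        match = MARKER.match(line.strip())
        if match:
            managed[match.group("name")] = match.group("obj")
    return managed


def refresh(config_text: str, fetch: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Fetch the current contents of every managed list.

    `fetch` is an assumption: any callable that resolves an IRR object to a
    list of prefix strings (e.g. a client for a bgpq3-backed service).
    """
    return {
        name: sorted(fetch(obj))
        for name, obj in find_managed_lists(config_text).items()
    }
```

An agent would call refresh() on a timer and install the result outside the
running config, so the list contents never appear in the pushed config.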

This has been in production for a year or two, without major incident.
It's all open source, available at 
https://github.com/wolcomm/eos-prefix-list-agent.
Pull-requests welcomed ;-)

I'm in the middle of writing the equivalent tool for junos at the
moment. Assuming that it works, we'll open source that too.

HTH,

Ben




Re: "Tactical" /24 announcements

2021-08-17 Thread Randy Bush
> Somewhat related, when JNPR implemented RTR the architecture was
> planned so that the RTR implementation itself isn't tightly coupled to
> RPKI validity. It was planned day1 that customers could have multiple
> RTR setups feeding prefixes and the NOS side could use these for other
> purposes too.

back in the day, the developing RP-rtr spec assumed multiple sources,
RPKI, IRR, ...  there was also an open-ended operator defined Color

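0          8          16         24        31
.-------------------------------------------.
| Protocol |   PDU    |                     |
| Version  |   Type   |        Color        |
|    0     |    4     |                     |
+-------------------------------------------+
|                                           |
|                 Length=20                 |
|                                           |
+-------------------------------------------+
|          |  Prefix  |   Max    |   Data   |
|  Flags   |  Length  |  Length  |  Source  |
|          |   0..32  |   0..32  | RPKI/IRR |
+-------------------------------------------+
|                                           |
|                IPv4 prefix                |
|                                           |
+-------------------------------------------+
|                                           |
|         Autonomous System Number          |
|                                           |
`-------------------------------------------'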

until, at ietf maastricht (2010), the unfortunate actions of a bully who
wanted more and more caused us to walk away, give up, and fall back to
the simpler basic we have today.

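0          8          16         24        31
.-------------------------------------------.
| Protocol |   PDU    |                     |
| Version  |   Type   |    reserved = zero  |
|    0     |    4     |                     |
+-------------------------------------------+
|                                           |
|                 Length=20                 |
|                                           |
+-------------------------------------------+
|          |  Prefix  |   Max    |          |
|  Flags   |  Length  |  Length  |   zero   |
|          |   0..32  |   0..32  |          |
+-------------------------------------------+
|                                           |
|                IPv4 Prefix                |
|                                           |
+-------------------------------------------+
|                                           |
|         Autonomous System Number          |
|                                           |
`-------------------------------------------'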
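
As an illustration of the simpler layout above (version 0, PDU type 4,
20 bytes, reserved fields zero), here is a minimal encoder sketch. The
function name is hypothetical, the ASN in the usage note is a documentation
ASN, and the flags convention (low bit 1 = announce, 0 = withdraw) is taken
from the published RTR spec rather than from this thread:

```python
import ipaddress
import struct

# version, type, reserved(16), length(32), flags, prefix len, max len, zero,
# IPv4 prefix (4 bytes), ASN (32); network byte order, no padding.
IPV4_PREFIX_PDU = struct.Struct("!BBHIBBBB4sI")


def pack_ipv4_prefix(prefix: str, max_len: int, asn: int, announce: bool = True) -> bytes:
    """Encode one IPv4 Prefix PDU in the 20-byte layout drawn above."""
    net = ipaddress.IPv4Network(prefix)
    if not net.prefixlen <= max_len <= 32:
        raise ValueError("max length must be between the prefix length and 32")
    return IPV4_PREFIX_PDU.pack(
        0,                           # protocol version
        4,                           # PDU type: IPv4 prefix
        0,                           # reserved = zero
        IPV4_PREFIX_PDU.size,        # length = 20
        1 if announce else 0,        # flags
        net.prefixlen,
        max_len,
        0,                           # zero
        net.network_address.packed,
        asn,
    )
```

For example, pack_ipv4_prefix("147.28.0.0/23", 24, 64496) yields exactly
20 bytes.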


randy


Re: "Tactical" /24 announcements

2021-08-17 Thread Tim Raphael
I quite like this approach as well - for those that would like to do more 
complicated policy logic off-box, the RTR architecture very much lends itself 
to that.

JNPR already has accessible APIs (JET-based / RPC) you can leverage to push 
configuration into the ephemeral database or be called on certain events (e.g. 
prefix learn). This, however comes with the acceptance of quite a few other 
risks. RTR could be used to signal other prefix options which would potentially 
remove the risks of dealing with the ephemeral config construct for certain 
use-cases, e.g. complex peer prefix filtering. 

- Tim

> On 17 Aug 2021, at 16:24, Saku Ytti  wrote:
> 
> I share your confusion Randy. It seems like perhaps Jakob answered a
> slightly different question and his answer is roughly.
> 
> a) Use this as-set feature to ensure valid set of ASNs from given peer
> b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
> and invalids)
> c) Don't punch in prefix-lists anywhere
> 
> Which in theory works, but in practice it does not, as RPKI validity
> cover is incomplete.
> 
> Somewhat related, when JNPR implemented RTR the architecture was
> planned so that the RTR implementation itself isn't tightly coupled to
> RPKI validity. It was planned day1 that customers could have multiple
> RTR setups feeding prefixes and the NOS side could use these for other
> purposes too. So technically JNPR is mostly missing CLI work to allow
> you to feed prefix-lists dynamically over RTR, instead of punching
> them in vendor-specific way in config.
> 
> I really hope JNPR does that work, I really like the appeal of doing
> things off-box and using the same protocol to talk to on-box. Also,
> give me gRPC/protobuf route policy API, so I can write my route-policy
> in a real programming language once for all my NOS.
> 
> 
>> On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>> 
>> hi jakob,
>> 
>> i am confused between
>> 
>>> There is no expansion to prefix-set.
>> 
>> and your earlier
>> 
>>>> We have introduced the scalable as-set into the XR route policy language.
>>>> as-path-set does not scale well with 1000's of ASNs.
>>>> Now, you don't need to expand AS-SET into prefix-set, just enter it 
>>>> directly.
>> 
>> expanding AS-SET into prefix filters is exactly what we do.
>> 
>> ```
>> % peval -s RIPE AS-RG-SEA
>> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
>> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
>> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
>> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
>> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
>> ```
>> 
>> i do not see how to get around this.  clue bat please
>> 
>> randy
> 
> 
> 
> -- 
>  ++ytti


Re: "Tactical" /24 announcements

2021-08-17 Thread Saku Ytti
I share your confusion, Randy. It seems like perhaps Jakob answered a
slightly different question, and his answer is roughly:

a) Use this as-set feature to ensure valid set of ASNs from given peer
b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
and invalids)
c) Don't punch in prefix-lists anywhere

Which in theory works, but in practice it does not, as RPKI validity
cover is incomplete.

Somewhat related, when JNPR implemented RTR the architecture was
planned so that the RTR implementation itself isn't tightly coupled to
RPKI validity. It was planned day1 that customers could have multiple
RTR setups feeding prefixes and the NOS side could use these for other
purposes too. So technically JNPR is mostly missing CLI work to allow
you to feed prefix-lists dynamically over RTR, instead of punching
them in vendor-specific way in config.

I really hope JNPR does that work, I really like the appeal of doing
things off-box and using the same protocol to talk to on-box. Also,
give me gRPC/protobuf route policy API, so I can write my route-policy
in a real programming language once for all my NOS.


On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>
> hi jakob,
>
> i am confused between
>
> > There is no expansion to prefix-set.
>
> and your earlier
>
> >> We have introduced the scalable as-set into the XR route policy language.
> >> as-path-set does not scale well with 1000's of ASNs.
> >> Now, you don't need to expand AS-SET into prefix-set, just enter it 
> >> directly.
>
> expanding AS-SET into prefix filters is exactly what we do.
>
> ```
> % peval -s RIPE AS-RG-SEA
> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
> ```
>
> i do not see how to get around this.  clue bat please
>
> randy



-- 
  ++ytti


Re: "Tactical" /24 announcements

2021-08-16 Thread Randy Bush
hi jakob,

i am confused between

> There is no expansion to prefix-set.

and your earlier

>> We have introduced the scalable as-set into the XR route policy language.
>> as-path-set does not scale well with 1000's of ASNs.
>> Now, you don't need to expand AS-SET into prefix-set, just enter it directly.

expanding AS-SET into prefix filters is exactly what we do.

```
% peval -s RIPE AS-RG-SEA
({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 147.28.14.0/24, 
147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 147.28.7.0/24, 
147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 45.132.189.0/24, 
45.132.190.0/24, 45.132.191.0/24})
```

i do not see how to get around this.  clue bat please

randy
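
A small sketch of the "expand AS-SET, then build the filter externally" step,
assuming peval output in the brace-and-comma form shown above. The function
names are hypothetical; the rendered lines use the Junos "set policy-options
prefix-list" form, and nothing here runs peval or pushes config:

```python
import ipaddress
import re
from typing import List


def parse_peval(output: str) -> List[ipaddress.IPv4Network]:
    """Extract the IPv4 prefixes from peval-style output, deduplicated and sorted."""
    found = re.findall(r"\d+\.\d+\.\d+\.\d+/\d+", output)
    return sorted({ipaddress.IPv4Network(p) for p in found})


def junos_prefix_list(name: str, nets: List[ipaddress.IPv4Network]) -> List[str]:
    """Render one 'set' line per prefix for a Junos prefix-list."""
    return [f"set policy-options prefix-list {name} {net}" for net in nets]
```

The prefixes are deliberately not aggregated, since collapsing adjacent /24s
would change what an exact-match filter accepts.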


Re: "Tactical" /24 announcements

2021-08-16 Thread Tom Beecher
Broadly speaking, I would say if you announce a prefix to the DFZ, then you
are saying "I can deliver anything in this range where it is supposed to
go."

That being said, as Bill pointed out, there are moments when an outage or
other issue prevents that from happening, and circumstances where a lack of
competence creates the same problem.

On Mon, Aug 16, 2021 at 12:07 PM William Herrin  wrote:

> On Mon, Aug 16, 2021 at 7:10 AM Jason Pope  wrote:
> >
> > >On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher wrote:
> > >> How does this break the Internet?
> > >
> > >A originates 10.0.0.0/16 to paid transit C
> > >B originates 10.0.1.0/24 also to paid transit C
> > >C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
> > >on same-next-hop
> > >You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
> > >originate 10.0.1.0/24 and D has discarded it.
> > >You send packets for 10.0.1.0/24 to A (the shortest path for
> > >10.0.0.0/16), stealing A's paid transit to C to get to B.
> > >Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
> > >doesn't currently transit for A so from B's perspective that's not an
> > >allowed path. In which case, your path to 10.0.1.0/24 is black holed.
> > >
> > >D broke the Internet. If packets from you reach A at all, they do so
> > >through an unpermitted path.
> >
> > Ok, I apologize, but I have some dumb questions (because I don't BGP
> > anymore):
> >
> > 1) I assume in the scenario that A "owns" (ARIN assignment) 10.0.0.0/16
> > and if B has a /24 assignment out of the block that A "owns", shouldn't
> > that mean that B has a business relationship with A and some kind of direct
> > connectivity to A?
>
> Hi Jason,
>
> Not necessarily. It isn't modern practice but as others have pointed
> out there have been instances where a customer took an ISP-assigned
> block with them when they left.
>
> > 3) If "yes", then the connectivity wouldn't be broken, right?
>
> Not necessarily. You have to consider the route in -all- of the states
> it can be in, including the one where they're not, at this moment,
> successfully connected to the ISP which assigned the addresses. I
> offered a scenario in a prior post where the ISP's peering router
> carries only locally-originated and customer routes. When the customer
> loses their connection to the ISP (e.g. cable cut) their route
> disappears from the peering router. The users of the ISP can still
> reach it via the origin's alternate Internet connection.
>
> Reciprocal peers of the ISP can also reach it via the broader Internet
> but can't reach it via the peering connection to the ISP to whom the
> origin is not currently connected. If they filter the Internet route,
> the path ends up going to the ISP's peering router where it's black
> holed.
>
> Regards,
> Bill Herrin
>
>
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: "Tactical" /24 announcements

2021-08-16 Thread William Herrin
On Mon, Aug 16, 2021 at 7:10 AM Jason Pope  wrote:
>
> >On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher  wrote:
> >> How does this break the Internet?
> >
> >A originates 10.0.0.0/16 to paid transit C
> >B originates 10.0.1.0/24 also to paid transit C
> >C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
> >on same-next-hop
> >You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
> >originate 10.0.1.0/24 and D has discarded it.
> >You send packets for 10.0.1.0/24 to A (the shortest path for
> >10.0.0.0/16), stealing A's paid transit to C to get to B.
> >Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
> >doesn't currently transit for A so from B's perspective that's not an
> >allowed path. In which case, your path to 10.0.1.0/24 is black holed.
> >
> >D broke the Internet. If packets from you reach A at all, they do so
> >through an unpermitted path.
>
> Ok, I apologize, but I have some dumb questions (because I don't BGP anymore):
>
> 1) I assume in the scenario that A "owns" (ARIN assignment) 10.0.0.0/16 and 
> if B has a /24 assignment out of the block that A "owns", shouldn't that mean 
> that B has a business relationship with A and some kind of direct 
> connectivity to A?

Hi Jason,

Not necessarily. It isn't modern practice but as others have pointed
out there have been instances where a customer took an ISP-assigned
block with them when they left.

> 3) If "yes", then the connectivity wouldn't be broken, right?

Not necessarily. You have to consider the route in -all- of the states
it can be in, including the one where they're not, at this moment,
successfully connected to the ISP which assigned the addresses. I
offered a scenario in a prior post where the ISP's peering router
carries only locally-originated and customer routes. When the customer
loses their connection to the ISP (e.g. cable cut) their route
disappears from the peering router. The users of the ISP can still
reach it via the origin's alternate Internet connection.

Reciprocal peers of the ISP can also reach it via the broader Internet
but can't reach it via the peering connection to the ISP to whom the
origin is not currently connected. If they filter the Internet route,
the path ends up going to the ISP's peering router where it's black
holed.

Regards,
Bill Herrin



-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-16 Thread Jason Pope
>On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher wrote:
>> On 12/08/2021 17:59, William Herrin wrote:
>> > If you prune the routes from the Routing Information Base instead, for
>> > any widely accepted size (i.e. /24 or shorter netmask) you break the
>> > Internet.
>>
>> How does this break the Internet?  I would think it would just result in
>> sub-optimal routing (provided there is a covering larger prefix) but
>> everything should continue to work.  Clue me in, please.
>
>A originates 10.0.0.0/16 to paid transit C
>B originates 10.0.1.0/24 also to paid transit C
>C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
>on same-next-hop
>You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
>originate 10.0.1.0/24 and D has discarded it.
>You send packets for 10.0.1.0/24 to A (the shortest path for
>10.0.0.0/16), stealing A's paid transit to C to get to B.
>Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
>doesn't currently transit for A so from B's perspective that's not an
>allowed path. In which case, your path to 10.0.1.0/24 is black holed.
>
>D broke the Internet. If packets from you reach A at all, they do so
>through an unpermitted path.
>
>Regards,
>Bill Herrin
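
The effect in the quoted scenario can be sketched with a toy longest-prefix
match. The RIB contents and next-hop labels below are illustrative
assumptions based on the A/B/C/D example, not real routing data:

```python
import ipaddress
from typing import Dict, Optional


def lookup(rib: Dict[str, str], dst: str) -> Optional[str]:
    """Return the next hop of the longest prefix in `rib` that covers `dst`."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# What "you" would hold if D still passed the /24 along.
full_rib = {"10.0.0.0/16": "A", "10.0.1.0/24": "D"}
# What "you" hold once D has dropped the /24 from its RIB.
pruned_rib = {"10.0.0.0/16": "A"}
```

With full_rib, traffic for 10.0.1.5 follows the /24 toward D; with pruned_rib
the same traffic falls back to the covering /16 and lands on A, which is the
unpermitted (or black-holed) path in the scenario.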

Ok, I apologize, but I have some dumb questions (because I don't BGP
anymore):

1) I assume in the scenario that A "owns" (ARIN assignment) 10.0.0.0/16 and
if B has a /24 assignment out of the block that A "owns", shouldn't that
mean that B has a business relationship with A and some kind of direct
connectivity to A?

2) If "no", then why is B using a /24 out of A's block? If A sold or gave
the block to B without a connectivity agreement, then A should break up
their announcements appropriately to carve the /24 out of their
announcement, right?

3) If "yes", then the connectivity wouldn't be broken, right?

TIA for the tutoring and bearing with me.

Regards,
Jason K Pope


Re: "Tactical" /24 announcements

2021-08-16 Thread Tim Weippert
Hi Jakob, 

but the as-set only checks the origin AS in the announcement; it doesn't
look up the prefix <-> AS relation from RADB/RIPE/whatever, if i understand it
correctly!

Or is there some lookup mechanism as Ytti/Mark mentioned?

regards, 
tim

On Sun, Aug 15, 2021 at 01:46:40AM +, Jakob Heitz (jheitz) via NANOG wrote:
> Ytti,
> 
> We have introduced the scalable as-set into the XR route policy language.
> as-path-set does not scale well with 1000's of ASNs.
> Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
> Example:
> as-set test
>   2914,
>   3356,
> end-set
> !
> route-policy sample
>   if as-path originates-from test then
>     pass
>   endif
> end-policy
> 
> If this does not meet your needs and you need improvements, let me know.
> 
> Kind Regards,
> Jakob.
> 
> -
> Date: Mon, 9 Aug 2021 19:10:23 +0300
> From: Saku Ytti 
> 
> We just recently learned of a IOS-XR prefix-set limit of 31 when a
> particular customer AS-SET expanded to a higher number of prefixes.
> 
> -- 
>   ++ytti
> 

-- 
Tim Weippert
http://weiti.org - we...@weiti.org
GPG Fingerprint - E704 7303 6FF0 8393 ADB1  398E 67F2 94AE 5995 7DD8


RE: "Tactical" /24 announcements

2021-08-16 Thread Jakob Heitz (jheitz) via NANOG
Saku,

The feature is in 7.2.1. The documentation has not made it to the
command reference.

There is no expansion to prefix-set. The command checks the origin-AS
in the route. You should confirm the origin-AS with the prefix
using RPKI and/or another route-policy statement.
This way the final route-policy configuration will be much smaller.
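
A sketch of the two checks described above, written out in Python rather than
route-policy: an origin-AS test against a set, plus origin validation of the
prefix against ROA-like records. The function names and the record tuple
layout are assumptions, and the classification follows the usual
valid / invalid / not-found scheme:

```python
import ipaddress
from typing import Iterable, List, Set, Tuple

Roa = Tuple[str, int, int]  # (prefix, max length, origin ASN); hypothetical layout


def originates_from(as_path: List[int], as_set: Set[int]) -> bool:
    """True if the last (origin) ASN of the path is a member of the set."""
    return bool(as_path) and as_path[-1] in as_set


def rov_state(prefix: str, origin: int, roas: Iterable[Roa]) -> str:
    """Classify a route as 'valid', 'invalid' or 'not-found' against the records."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this record does not cover the route
        covered = True
        if route.prefixlen <= max_len and origin == roa_asn:
            return "valid"
    return "invalid" if covered else "not-found"
```

A route whose prefix has no covering record comes back "not-found", so the
origin-AS check alone is all that constrains it unless such routes are
rejected or extra records are added locally.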

I'm happy to answer more questions or requests for improvement
on or off list.

Regards,
Jakob.

-Original Message-
From: Saku Ytti  
Sent: Saturday, August 14, 2021 11:11 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: "Tactical" /24 announcements

Hey Jakob,

Is there documentation for this somewhere? Are you saying that the
IOS-XR host will connect to some (configured?) server to expand the
as-set, and at what time? Commit time? Once every N?

On Sun, 15 Aug 2021 at 04:50, Jakob Heitz (jheitz) via NANOG
 wrote:
>
> Ytti,
>
> We have introduced the scalable as-set into the XR route policy language.
> as-path-set does not scale well with 1000's of ASNs.
> Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
> Example:
> as-set test
>   2914,
>   3356,
> end-set
> !
> route-policy sample
>   if as-path originates-from test then
>     pass
>   endif
> end-policy
>
> If this does not meet your needs and you need improvements, let me know.
>
> Kind Regards,
> Jakob.
>
> -
> Date: Mon, 9 Aug 2021 19:10:23 +0300
> From: Saku Ytti 
>
> We just recently learned of a IOS-XR prefix-set limit of 31 when a
> particular customer AS-SET expanded to a higher number of prefixes.
>
> --
>   ++ytti
>


-- 
  ++ytti


Re: "Tactical" /24 announcements

2021-08-15 Thread Mark Tinka




On 8/15/21 08:11, Saku Ytti wrote:

Hey Jakob,

Is there documentation for this somewhere? Are you saying that the
IOS-XR host will connect to some (configured?) server to expand the
as-set, and at what time? Commit time? Once every N?


Yes, same question for me.

We've dumped all of our IOS XR eBGP-facing routers for Junos in the past 
3 years, but it would be something good to know to give us options for 
the future.


Mark.


Re: "Tactical" /24 announcements

2021-08-15 Thread Masataka Ohta

Jeff Tantsura wrote:

> where the routes computed would still be a
> subject to best route selection and hence
> reasonably safe wrt loops.

As Baldur said:

> For all the stub networks out there we should be able to
> aggressively filter routes without much harm.

thanks to IRRs and RPKI, whatever wrong things stub networks
might do, they only harm the stub networks and their peers
and are "reasonably safe" for rest of us.

So?

Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-15 Thread Saku Ytti
Hey Jakob,

Is there documentation for this somewhere? Are you saying that the
IOS-XR host will connect to some (configured?) server to expand the
as-set, and at what time? Commit time? Once every N?

On Sun, 15 Aug 2021 at 04:50, Jakob Heitz (jheitz) via NANOG
 wrote:
>
> Ytti,
>
> We have introduced the scalable as-set into the XR route policy language.
> as-path-set does not scale well with 1000's of ASNs.
> Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
> Example:
> as-set test
>   2914,
>   3356,
> end-set
> !
> route-policy sample
>   if as-path originates-from test then
>     pass
>   endif
> end-policy
>
> If this does not meet your needs and you need improvements, let me know.
>
> Kind Regards,
> Jakob.
>
> -
> Date: Mon, 9 Aug 2021 19:10:23 +0300
> From: Saku Ytti 
>
> We just recently learned of a IOS-XR prefix-set limit of 31 when a
> particular customer AS-SET expanded to a higher number of prefixes.
>
> --
>   ++ytti
>


-- 
  ++ytti


RE: "Tactical" /24 announcements

2021-08-14 Thread Jakob Heitz (jheitz) via NANOG
Ytti,

We have introduced the scalable as-set into the XR route policy language.
as-path-set does not scale well with 1000's of ASNs.
Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
Example:
as-set test
  2914,
  3356,
end-set
!
route-policy sample
  if as-path originates-from test then
    pass
  endif
end-policy

If this does not meet your needs and you need improvements, let me know.

Kind Regards,
Jakob.

-
Date: Mon, 9 Aug 2021 19:10:23 +0300
From: Saku Ytti 

We just recently learned of a IOS-XR prefix-set limit of 31 when a
particular customer AS-SET expanded to a higher number of prefixes.

-- 
  ++ytti



Re: "Tactical" /24 announcements

2021-08-14 Thread Jeff Tantsura
Every major vendor at some point in time has implemented RIB->FIB (really 
BGP->RIB->FIB) filtering; on Redback/Ericsson routers we did it around 
2013/2014 (@Jakob Heitz ;-)).
Route compression is a more complex topic: it is not difficult to aggregate, the 
hard part is to disaggregate effectively on changes.
MS Research published a white paper in the early 2010s; Volta in the late 2010s 
implemented route aggregation quite effectively on top of the FRR BGP stack (full 
BGP table into Trident2-class silicon); to my memory, Spotify did a similar 
implementation with Arista.

Most importantly, these days vendors (at least Cisco and Juniper) let you, 
through service-layer APIs, run best path off-box and reinject the results back 
into the RIB, where the computed routes would still be subject to best route 
selection and hence reasonably safe wrt loops.
So if you feel adventurous, write your own compression code; the toolset is there.

Cheers,
Jeff

> On Aug 14, 2021, at 06:21, Masataka Ohta  
> wrote:
> 
> Baldur Norddahl wrote:
> 
>> For all the stub networks out there we should be able to aggressively
>> filter routes without much harm.
> 
> Stub networks, which, by definition, do not have transit traffic
> over them, can not filter routes for transit traffic at all.
> 
> But, if both of two stub networks with address ranges of
> 131.112.32.0/24 and 131.112.33.0/24 advertise 131.112.32.0/23,
> the result will be disastrous for the networks.
> 
> As such, even stub networks should advertise exact address
> ranges of them.
> 
>Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-14 Thread Masataka Ohta

Baldur Norddahl wrote:


For all the stub networks out there we should be able to aggressively
filter routes without much harm.


Stub networks, which, by definition, do not have transit traffic
over them, can not filter routes for transit traffic at all.

But, if both of two stub networks with address ranges of
131.112.32.0/24 and 131.112.33.0/24 advertise 131.112.32.0/23,
the result will be disastrous for the networks.

As such, even stub networks should advertise exact address
ranges of them.
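
To make the example concrete, a small Python sketch using the standard ipaddress module. The variable names are just for illustration. It shows that the /23 covers both /24s while the two /24s do not overlap each other, so if both stubs advertise the /23, a remote network picks one of them and traffic for the other stub's /24 can arrive at the wrong network.

```python
import ipaddress

stub_a = ipaddress.ip_network("131.112.32.0/24")
stub_b = ipaddress.ip_network("131.112.33.0/24")
aggregate = ipaddress.ip_network("131.112.32.0/23")

# The aggregate covers both stubs' address ranges...
covers_a = stub_a.subnet_of(aggregate)
covers_b = stub_b.subnet_of(aggregate)

# ...but the stubs themselves hold disjoint ranges, so neither of them
# can actually deliver traffic for the whole /23.
overlap = stub_a.overlaps(stub_b)

print(covers_a, covers_b, overlap)
```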

Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-14 Thread Masataka Ohta

Tom Beecher wrote:


6.1.3 .

at the time of writing of this document, IPv4
prefixes longer than /24 and IPv6 prefixes longer than /48 are
generally neither announced nor accepted in the Internet


That's why, unlike IPv4, IPv6 is hopeless as rfc2374
was abandoned.

Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-14 Thread Mark Tinka




On 8/12/21 19:57, Jon Lewis wrote:



Yeah...changes to the network could suddenly run such a box out of FIB 
resources, and you could easily be wrong when predicting how much 
longer a box has for its "full routes" days...but the alternatives 
are "don't do full routes" or replace the box much sooner.
In that respect, it's somewhat remarkable that Arista even developed 
the feature.  "We can sell them newer hardware with larger FIB 
capabilities, or offer a software update that extends the life of the 
gear they've already bought."  What company chooses the latter? :)


There was a time when vendors were actually run by engineers :-).

I recall asking for the feature from Cisco around 2011/2012, for the 
ME3600X/3800X, and that's how it arrived. The team developing that breed 
of box were excited about the prospect of its success, since I began 
working with them to develop it in 2009. So they rolled out as many 
features as I could help them make sense of, and BGP-SD was one of them.


The good news is it made it into IOS XE, and became available for a ton 
of other platforms, the ASR920 included.


Nowadays, one wonders who's actually running the show at vendor-land...

Mark.


Re: "Tactical" /24 announcements

2021-08-14 Thread Mark Tinka




On 8/12/21 19:30, Nick Hilliard wrote:




it also causes non-deterministic fib resource consumption. On most 
edge deployments this won't matter, but it wouldn't be hard to cook up 
a topology that could fail in interesting ways.  Overall fib 
compression is a net win, but you need to be careful with it.


We only needed it on boxes with a small FIB, to begin with. We don't use 
this on large routers with millions of FIB slots to spare.


The challenge on small boxes is that at some point, even with the FIB 
filtering, you will run out of FIB slots, and then weird things start 
happening.


At that point, dumping the box and going for something larger (say, 
dropping a 20,000 FIB box and going for a 256,000 FIB box) is your only 
real option. It's still cheaper than a large box with lots more FIB, and 
you can continue the benefits of FIB filtering without causing weird 
problems in the network.


Mark.


Re: "Tactical" /24 announcements

2021-08-13 Thread Mark Tinka




On 8/12/21 19:19, William Herrin wrote:


A originates 10.0.0.0/16 to paid transit C
B originates 10.0.1.0/24 also to paid transit C
C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
on same-next-hop


Yeah, discarding from RIB is not the idea. It's discarding from FIB.

RIB is always globally converged.
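
A minimal Python sketch of the RIB-versus-FIB distinction, assuming a toy RIB that maps prefixes to next hops. This is not any vendor's algorithm, and the function name is hypothetical. The RIB is left untouched (and is what you would still advertise downstream); only the FIB skips a more-specific route when the closest covering FIB entry already points at the same next hop.

```python
import ipaddress

def fib_from_rib(rib):
    """rib: dict of prefix string -> next hop. Returns the FIB as a dict,
    omitting more-specifics whose closest covering FIB entry has the
    same next hop (forwarding is unchanged without them)."""
    routes = sorted(
        ((ipaddress.ip_network(p), nh) for p, nh in rib.items()),
        key=lambda item: item[0].prefixlen,   # shortest prefixes first
    )
    fib = {}
    for net, nh in routes:
        covering = [
            (n, h) for n, h in fib.items()
            if n.version == net.version and net.subnet_of(n)
        ]
        if covering:
            # closest covering entry = the longest covering prefix
            _, parent_nh = max(covering, key=lambda item: item[0].prefixlen)
            if parent_nh == nh:
                continue   # redundant in the FIB; still present in the RIB
        fib[net] = nh
    return {str(n): h for n, h in fib.items()}

rib = {"10.0.0.0/16": "C", "10.0.1.0/24": "C", "10.0.2.0/24": "D"}
fib = fib_from_rib(rib)
print(fib)
```

Here 10.0.1.0/24 stays in the RIB but is left out of the FIB, because the /16 already forwards to C; 10.0.2.0/24 has a different next hop and must be installed.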

Mark.


Re: "Tactical" /24 announcements

2021-08-13 Thread Mark Tinka



On 8/12/21 19:17, Amir Herzberg wrote:



Hi Hank, I think you're right, it could result in sub-optimal routing 
and in particular, in your AS not being used for these subprefixes 
(the traffic will go instead to a competing provider who sent the 
subprefix), hence, as you said, sub-optimal routing.


Incorrect - you hold the full table in RIB, which is what you offer to 
your downstream customers.


It's the FIB which wouldn't have the route, but would still be able to 
forward the traffic to a router that knows better.


Downstream customers are none the wiser.

Mark.


Re: "Tactical" /24 announcements

2021-08-13 Thread Mark Tinka




On 8/12/21 16:42, Tom Hill wrote:



I'm glad to hear a vendor has implemented a useful knob. Which vendor?


BGP-SD (Selective Download) from Cisco since about 2013.

I know both Juniper and Nokia have their versions as well.

It's nothing new.

Mark.


Re: "Tactical" /24 announcements

2021-08-13 Thread Baldur Norddahl
On Fri, Aug 13, 2021 at 10:53 PM Amir Herzberg  wrote:

>
> I think it isn't the same.
>

I am still not sure but maybe I misunderstood what you originally said. It
is probably not important.


> I think that the NANOG (or in general, operators) community may do well to
> state the `/24 rule' clearly in a BCP, preferably an RFC. A mismatch in the
> most-specific rule can definitely allow different problems (and attacks).
> As mentioned above, RIPE has essentially done this (although could be more
> explicit). I've seen a similar /48 rule for IPv6, btw.
>

I am not sure how big a problem this is. We only had this one case that I
described and it was easily fixed by allowing that one prefix from our
transit. The peer also offered to fix their announcement. But we did not
run with it for very long because we only reduced our routing table to
debug a different problem.

Maybe we could have a community or other mechanism to mark the few routes
that can not be dropped in exchange for a default route.

For all the stub networks out there we should be able to aggressively
filter routes without much harm.

Regards,

Baldur


Re: "Tactical" /24 announcements

2021-08-13 Thread Amir Herzberg
Tom, I also referred to the same text from 7454! But Baldur is right: the
text does NOT clearly state that announcements more specific than /24
should be filtered.

If you allow different operators to filter at different lengths, you can
get disconnections. We never like standards to be fixed to some number,
but this seems necessary (to me).
-- 
Amir Herzberg

Comcast professor of Security Innovations, Computer Science and
Engineering, University of Connecticut
Homepage: https://sites.google.com/site/amirherzberg/home
`Applied Introduction to Cryptography' textbook and lectures:
 https://sites.google.com/site/amirherzberg/applied-crypto-textbook





On Fri, Aug 13, 2021 at 5:05 PM Tom Beecher  wrote:

> I think that the NANOG (or in general, operators) community may do well to
>> state the `/24 rule' clearly in a BCP, preferably an RFC.
>>
>
> https://datatracker.ietf.org/doc/html/rfc7454
>
> 6.1.3 .
>> Prefixes That Are Too Specific
>> Most ISPs will not accept advertisements beyond a certain level of
>> specificity (and in return, they do not announce prefixes they
>> consider to be too specific). That acceptable specificity is decided
>> for each peering between the two BGP peers. Some ISP communities
>> have tried to document acceptable specificity. This document does
>> not make any judgement on what the best approach is, it just notes
>> that there are existing practices on the Internet and recommends that
>> the reader refer to them. As an example, the RIPE community has
>> documented that, at the time of writing of this document, IPv4
>> prefixes longer than /24 and IPv6 prefixes longer than /48 are
>> generally neither announced nor accepted in the Internet [20
>> ] [21
>> ].
>>These values may change in the future.
>
>
> On Fri, Aug 13, 2021 at 4:54 PM Amir Herzberg 
> wrote:
>
>> On Fri, Aug 13, 2021 at 12:50 PM Baldur Norddahl <
>> baldur.nordd...@gmail.com> wrote:
>>
>>>
>>> On Fri, Aug 13, 2021 at 3:54 AM Amir Herzberg 
>>> wrote:
>>>
>>>> On Thu, Aug 12, 2021 at 4:32 PM Baldur Norddahl <
>>>> baldur.nordd...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg 
>>>>> wrote:
>>>>>
>>>>>> Bill, I beg to respectfully differ, knowing that I'm just a
>>>>>> researcher and working `for real' like you guys, so pls take no offence.
>>>>>>
>>>>>> I don't think A would be right to filter these packets to 10.0.1.0/24;
>>>>>> A has announced 10.0.0.0/16 so should route to that (entire) prefix,
>>>>>> or A is misleading its peers.
>>>>>>
>>>>>
>>>>> You are right that it is wrong but it happens. Some years back I tried
>>>>> a setup where we wanted to reduce the size of the routing table. We dropped
>>>>> everything but routes received from peers and added a default to one of our
>>>>> IP transit providers. This should have been ok because either we had a
>>>>> route to a peer or the packet would go to someone who had the full routing
>>>>> table, yes?
>>>>>
>>>>
>>>> Baldur, thanks, but, sorry, this isn't the same, or I miss something.
>>>>
>>>
>>> I think it is exactly the same? Our peer is advertising a prefix for
>>> which they will not route all addresses covered. Is that route not then a
>>> lie? Should they not have exploded the prefix so they could avoid covering
>>> the part of the prefix they will not accept traffic to? (ps: not arguing
>>> they should!)
>>>
>>
>> I think it isn't the same. This scenario, of an AS selling part of its IP
>> block, e.g., 10.0.1.0/24, and continuing to announce the block, e.g.,
>> 10.0.0.0/16, is the classical example used (e.g. by me) to explain the
>> `most specific' rule. Due to `most specific', it is considered, imho, legit
>> to continue to announce 10.0.0.0/16; if 10.0.1.0/24 is reachable,
>> traffic will route to it anyway due to `more specific', so no problem; and
>> if 10.0.1.0/24 isn't reachable, then anyway no harm done...
>>
>> By dropping a legit 10.0.1.0/24 announcement, you may - and in the cited
>> example, did - break connectivity, imho. And quite unnecessarily, too.
>>
>>>
>>>
>>>> If I get you right, you dropped all announcements from _providers_
>>>> except making one provider your default gateway (essentially, 0.0.0.0/0).
>>>> But this is very different from what I understood from what Bill wrote.
>>>> Your change could (and, from what you say next, did) cause a problem if one
>>>> of these announcements you dropped from providers was a legit subprefix of
>>>> a prefix announced by one of your peers, causing you to route to the peer
>>>> traffic whose destination is in the subprefix.
>>>>
>>>
>>> Your understanding is correct. But this is just the way we ended up in
>>> that situation. Does not change the fact that we got a route from a peer
>>> that we 

Re: "Tactical" /24 announcements

2021-08-13 Thread Tom Beecher
>
> I think that the NANOG (or in general, operators) community may do well to
> state the `/24 rule' clearly in a BCP, preferably an RFC.
>

https://datatracker.ietf.org/doc/html/rfc7454

6.1.3 .
> Prefixes That Are Too Specific
> Most ISPs will not accept advertisements beyond a certain level of
> specificity (and in return, they do not announce prefixes they
> consider to be too specific). That acceptable specificity is decided
> for each peering between the two BGP peers. Some ISP communities
> have tried to document acceptable specificity. This document does
> not make any judgement on what the best approach is, it just notes
> that there are existing practices on the Internet and recommends that
> the reader refer to them. As an example, the RIPE community has
> documented that, at the time of writing of this document, IPv4
> prefixes longer than /24 and IPv6 prefixes longer than /48 are
> generally neither announced nor accepted in the Internet [20
> ] [21
> ].
>These values may change in the future.


On Fri, Aug 13, 2021 at 4:54 PM Amir Herzberg  wrote:

> On Fri, Aug 13, 2021 at 12:50 PM Baldur Norddahl <
> baldur.nordd...@gmail.com> wrote:
>
>>
>> On Fri, Aug 13, 2021 at 3:54 AM Amir Herzberg 
>> wrote:
>>
>>> On Thu, Aug 12, 2021 at 4:32 PM Baldur Norddahl <
>>> baldur.nordd...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg 
>>>> wrote:
>>>>
>>>>> Bill, I beg to respectfully differ, knowing that I'm just a researcher
>>>>> and working `for real' like you guys, so pls take no offence.
>>>>>
>>>>> I don't think A would be right to filter these packets to 10.0.1.0/24;
>>>>> A has announced 10.0.0.0/16 so should route to that (entire) prefix,
>>>>> or A is misleading its peers.
>>>>>
>>>>
>>>> You are right that it is wrong but it happens. Some years back I tried
>>>> a setup where we wanted to reduce the size of the routing table. We dropped
>>>> everything but routes received from peers and added a default to one of our
>>>> IP transit providers. This should have been ok because either we had a
>>>> route to a peer or the packet would go to someone who had the full routing
>>>> table, yes?
>>>>
>>>
>>> Baldur, thanks, but, sorry, this isn't the same, or I miss something.
>>>
>>
>> I think it is exactly the same? Our peer is advertising a prefix for
>> which they will not route all addresses covered. Is that route not then a
>> lie? Should they not have exploded the prefix so they could avoid covering
>> the part of the prefix they will not accept traffic to? (ps: not arguing
>> they should!)
>>
>
> I think it isn't the same. This scenario, of an AS selling part of its IP
> block, e.g., 10.0.1.0/24, and continuing to announce the block, e.g.,
> 10.0.0.0/16, is the classical example used (e.g. by me) to explain the
> `most specific' rule. Due to `most specific', it is considered, imho, legit
> to continue to announce 10.0.0.0/16; if 10.0.1.0/24 is reachable, traffic
> will route to it anyway due to `more specific', so no problem; and if
> 10.0.1.0/24 isn't reachable, then anyway no harm done...
>
> By dropping a legit 10.0.1.0/24 announcement, you may - and in the cited
> example, did - break connectivity, imho. And quite unnecessarily, too.
>
>>
>>
>>> If I get you right, you dropped all announcements from _providers_
>>> except making one provider your default gateway (essentially, 0.0.0.0/0).
>>> But this is very different from what I understood from what Bill wrote.
>>> Your change could (and, from what you say next, did) cause a problem if one
>>> of these announcements you dropped from providers was a legit subprefix of
>>> a prefix announced by one of your peers, causing you to route to the peer
>>> traffic whose destination is in the subprefix.
>>>
>>
>> Your understanding is correct. But this is just the way we ended up in
>> that situation. Does not change the fact that we got a route from a peer
>> that we believed we could use, but turns out part of that announcement was
>> a lie.
>>
>
> Was not a lie, as I explained.
>
>>
>> Consider that everyone filters received routes. The most common is to
>> filter at the /24 level but nowhere is there a RFC stating that /24 is
>> anything special.
>>
>
> Oh that's a point I was quite annoyed with for years - who said one should
> drop prefixes longer than /24 ??? And I searched for it, and indeed found
> no RFC. But I did find it mentioned in some BCPs!
> Unfortunately and stupidly, I didn't save these sources, but I did a quick
> google now and found
>
>
> https://nsrc.org/workshops/2018/linx103-bgp/networking/peering-ixp/en/presentations/05-BGP-BCP.pdf
>
> But that was years ago, and indeed, I also found mention in RFC 7454:
>
>
>> 6.1.3 .  Prefixes 
>> That Are Too Specific
>>
>> 

Re: "Tactical" /24 announcements

2021-08-13 Thread Amir Herzberg
On Fri, Aug 13, 2021 at 12:50 PM Baldur Norddahl 
wrote:

>
> On Fri, Aug 13, 2021 at 3:54 AM Amir Herzberg 
> wrote:
>
>> On Thu, Aug 12, 2021 at 4:32 PM Baldur Norddahl <
>> baldur.nordd...@gmail.com> wrote:
>>
>>>
>>>
>>> On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg 
>>> wrote:
>>>
>>>> Bill, I beg to respectfully differ, knowing that I'm just a researcher
>>>> and working `for real' like you guys, so pls take no offence.
>>>>
>>>> I don't think A would be right to filter these packets to 10.0.1.0/24;
>>>> A has announced 10.0.0.0/16 so should route to that (entire) prefix,
>>>> or A is misleading its peers.
>>>>
>>>
>>> You are right that it is wrong but it happens. Some years back I tried a
>>> setup where we wanted to reduce the size of the routing table. We dropped
>>> everything but routes received from peers and added a default to one of our
>>> IP transit providers. This should have been ok because either we had a
>>> route to a peer or the packet would go to someone who had the full routing
>>> table, yes?
>>>
>>
>> Baldur, thanks, but, sorry, this isn't the same, or I miss something.
>>
>
> I think it is exactly the same? Our peer is advertising a prefix for which
> they will not route all addresses covered. Is that route not then a lie?
> Should they not have exploded the prefix so they could avoid covering the
> part of the prefix they will not accept traffic to? (ps: not arguing they
> should!)
>

I think it isn't the same. This scenario, of an AS selling part of its IP
block, e.g., 10.0.1.0/24, and continuing to announce the block, e.g.,
10.0.0.0/16, is the classical example used (e.g. by me) to explain the
`most specific' rule. Due to `most specific', it is considered, imho, legit
to continue to announce 10.0.0.0/16; if 10.0.1.0/24 is reachable, traffic
will route to it anyway due to `more specific', so no problem; and if
10.0.1.0/24 isn't reachable, then anyway no harm done...

By dropping a legit 10.0.1.0/24 announcement, you may - and in the cited
example, did - break connectivity, imho. And quite unnecessarily, too.
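
To illustrate the `most specific' rule, here is a small Python sketch of longest-prefix matching over a toy routing table. The table layout, the next-hop labels and the function name are made up for the example. With both routes present, traffic to 10.0.1.0/24 follows the /24; once the /24 announcement is dropped, the same traffic silently follows the covering /16.

```python
import ipaddress

def lookup(table, dest):
    """Return the next hop of the longest prefix in table that contains dest,
    or None if no route matches."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

table = {"10.0.0.0/16": "A", "10.0.1.0/24": "B"}

with_specific = lookup(table, "10.0.1.5")       # follows the /24
elsewhere = lookup(table, "10.0.2.5")           # only the /16 matches

filtered = {p: nh for p, nh in table.items() if p != "10.0.1.0/24"}
without_specific = lookup(filtered, "10.0.1.5")  # falls back to the /16

print(with_specific, elsewhere, without_specific)
```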

>
>
>> If I get you right, you dropped all announcements from _providers_ except
>> making one provider your default gateway (essentially, 0.0.0.0/0). But
>> this is very different from what I understood from what Bill wrote. Your
>> change could (and, from what you say next, did) cause a problem if one of
>> these announcements you dropped from providers was a legit subprefix of a
>> prefix announced by one of your peers, causing you to route to the peer
>> traffic whose destination is in the subprefix.
>>
>
> Your understanding is correct. But this is just the way we ended up in
> that situation. Does not change the fact that we got a route from a peer
> that we believed we could use, but turns out part of that announcement was
> a lie.
>

Was not a lie, as I explained.

>
> Consider that everyone filters received routes. The most common is to
> filter at the /24 level but nowhere is there a RFC stating that /24 is
> anything special.
>

Oh that's a point I was quite annoyed with for years - who said one should
drop prefixes longer than /24 ??? And I searched for it, and indeed found
no RFC. But I did find it mentioned in some BCPs!
Unfortunately and stupidly, I didn't save these sources, but I did a quick
google now and found

https://nsrc.org/workshops/2018/linx103-bgp/networking/peering-ixp/en/presentations/05-BGP-BCP.pdf

But that was years ago, and indeed, I also found mention in RFC 7454:


> 6.1.3 .  Prefixes 
> That Are Too Specific
>
>Most ISPs will not accept advertisements beyond a certain level of
>specificity (and in return, they do not announce prefixes they
>consider to be too specific).  That acceptable specificity is decided
>for each peering between the two BGP peers.  Some ISP communities
>have tried to document acceptable specificity.  This document does
>not make any judgement on what the best approach is, it just notes
>that there are existing practices on the Internet and recommends that
>the reader refer to them.  As an example, the RIPE community has
>documented that, at the time of writing of this document, IPv4
>prefixes longer than /24 and IPv6 prefixes longer than /48 are
>generally neither announced nor accepted in the Internet [20 
> ] [21 
> ].
>These values may change in the future.
>
>
I also did an experiment that seemed to confirm that most ISPs filter
announcements more specific than /24.

I think that the NANOG (or in general, operators) community may do well to
state the `/24 rule' clearly in a BCP, preferably an RFC. A mismatch in the
most-specific rule can definitely allow different problems (and attacks).
As mentioned above, RIPE has essentially done this (although could be more
explicit). I've seen a similar /48 rule 

Re: "Tactical" /24 announcements

2021-08-13 Thread William Herrin
On Fri, Aug 13, 2021 at 9:49 AM Baldur Norddahl
 wrote:
> Our peer is advertising a prefix for which they will not route
> all addresses covered. Is that route not then a lie? Should
> they not have exploded the prefix so they could avoid covering
> the part of the prefix they will not accept traffic to? (ps: not arguing they 
> should!)

Hi Baldur,

You do understand the consequence of the position you're taking?
You're saying that when an ISP provides a /24 to a customer for
multihoming, a common practice throughout the history of the
commercial Internet, that ISP MUST also disaggregate the announcement
for the supernet that /24 is a part of, exploding the size of the BGP
table. If they don't, the overlapping announcement is a "lie" because
they don't always have a route to the /24.
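
To put a number on the disaggregation, a short Python sketch using ipaddress.address_exclude. The prefixes are the thread's example values. Carving a single /24 out of a /16 turns one announcement into eight.

```python
import ipaddress

supernet = ipaddress.ip_network("10.0.0.0/16")
customer = ipaddress.ip_network("10.0.1.0/24")

# Everything in the /16 except the customer's /24, as the minimal
# set of prefixes the ISP would have to announce instead.
remaining = sorted(supernet.address_exclude(customer))

for net in remaining:
    print(net)
print(len(remaining), "announcements instead of 1")
```

Multiply that by every multihomed customer /24 and the effect on the global table is clear.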

Regards,
Bill Herrin



-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-13 Thread Baldur Norddahl
On Fri, Aug 13, 2021 at 3:54 AM Amir Herzberg  wrote:

> On Thu, Aug 12, 2021 at 4:32 PM Baldur Norddahl 
> wrote:
>
>>
>>
>> On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg 
>> wrote:
>>
>>> Bill, I beg to respectfully differ, knowing that I'm just a researcher
>>> and working `for real' like you guys, so pls take no offence.
>>>
>>> I don't think A would be right to filter these packets to 10.0.1.0/24;
>>> A has announced 10.0.0.0/16 so should route to that (entire) prefix, or
>>> A is misleading its peers.
>>>
>>
>> You are right that it is wrong but it happens. Some years back I tried a
>> setup where we wanted to reduce the size of the routing table. We dropped
>> everything but routes received from peers and added a default to one of our
>> IP transit providers. This should have been ok because either we had a
>> route to a peer or the packet would go to someone who had the full routing
>> table, yes?
>>
>
> Baldur, thanks, but, sorry, this isn't the same, or I miss something.
>

I think it is exactly the same? Our peer is advertising a prefix for which
they will not route all addresses covered. Is that route not then a lie?
Should they not have exploded the prefix so they could avoid covering the
part of the prefix they will not accept traffic to? (ps: not arguing they
should!)


> If I get you right, you dropped all announcements from _providers_ except
> making one provider your default gateway (essentially, 0.0.0.0/0). But
> this is very different from what I understood from what Bill wrote. Your
> change could (and, from what you say next, did) cause a problem if one of
> these announcements you dropped from providers was a legit subprefix of a
> prefix announced by one of your peers, causing you to route to the peer
> traffic whose destination is in the subprefix.
>

Your understanding is correct. But this is just the way we ended up in that
situation. Does not change the fact that we got a route from a peer that we
believed we could use, but turns out part of that announcement was a lie.

Consider that everyone filters received routes. The most common is to
filter at the /24 level but nowhere is there a RFC stating that /24 is
anything special. So what if I was to filter at a different level, say /20
? The same thing would happen, we would drop the "/24 exception route" and
use the route that is a lie.

Regards,

Baldur


Re: "Tactical" /24 announcements

2021-08-13 Thread Sabri Berisha
- On Aug 12, 2021, at 10:38 AM, Amir Herzberg amir.li...@gmail.com wrote:

Hi,

> I don't think A would be right to filter these packets to 10.0.1.0/24; A has 
> announced
> 10.0.0.0/16 so should route to that (entire) prefix, or A is misleading its 
> peers.

This is what it boils down to. If you don't want to route it, don't advertise 
it.

Thanks,

Sabri


Re: "Tactical" /24 announcements

2021-08-12 Thread Amir Herzberg
On Thu, Aug 12, 2021 at 4:32 PM Baldur Norddahl 
wrote:

>
>
> On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg 
> wrote:
>
>> Bill, I beg to respectfully differ, knowing that I'm just a researcher
>> and working `for real' like you guys, so pls take no offence.
>>
>> I don't think A would be right to filter these packets to 10.0.1.0/24; A
>> has announced 10.0.0.0/16 so should route to that (entire) prefix, or A
>> is misleading its peers.
>>
>
> You are right that it is wrong but it happens. Some years back I tried a
> setup where we wanted to reduce the size of the routing table. We dropped
> everything but routes received from peers and added a default to one of our
> IP transit providers. This should have been ok because either we had a
> route to a peer or the packet would go to someone who had the full routing
> table, yes?
>

Baldur, thanks, but, sorry, this isn't the same, or I miss something. If I
get you right, you dropped all announcements from _providers_ except making
one provider your default gateway (essentially, 0.0.0.0/0). But this is
very different from what I understood from what Bill wrote. Your change
could (and, from what you say next, did) cause a problem if one of these
announcements you dropped from providers was a legit subprefix of a prefix
announced by one of your peers, causing you to route to the peer traffic
whose destination is in the subprefix. But let me be concrete using what
you wrote:

>
> So we got complaints. One was a company who would advertise a /20 on a
> peering with us. But somewhere else far away they had a site from where
> they would announce a /24 from the same prefix. With no internal routing
> between the peering site with the /20 to the other site with the /24. We
> therefore lost the ability to communicate with that /24.
>

exactly; but this is since you incorrectly dropped the subprefix
announcement which you evidently received from one of your providers.

If this analysis is correct, you could have solved the problem - reducing
the FIB while avoiding such loss of connectivity - if you retained (only)
the announcements from your providers which were to subprefixes of
announcements you got from your peers. A bit of scripting required, of
course... I'm sure you can do it 100 times faster and better than me :)
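
For what it's worth, here is a rough Python sketch of that scripting, assuming you can export the prefixes learned from peers and from providers as plain lists of strings. The function name and the sample prefixes are hypothetical. It keeps only those provider routes that are strictly more specific than some peer route.

```python
import ipaddress

def provider_routes_to_keep(peer_prefixes, provider_prefixes):
    """Return the provider prefixes that are strict subprefixes of at
    least one peer prefix; everything else can go to the default route."""
    peers = [ipaddress.ip_network(p) for p in peer_prefixes]
    keep = []
    for prefix in provider_prefixes:
        net = ipaddress.ip_network(prefix)
        if any(net.version == peer.version
               and net != peer
               and net.subnet_of(peer)
               for peer in peers):
            keep.append(prefix)
    return keep

peer_routes = ["10.0.0.0/20"]
provider_routes = ["10.0.1.0/24", "172.16.0.0/16", "10.0.0.0/20"]
kept = provider_routes_to_keep(peer_routes, provider_routes)
print(kept)
```

The nested loop is quadratic, which is fine for a sketch; against a full table you would want a prefix trie, and you would have to rerun it whenever either set of announcements changes.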

>
> You see variants of this. For example a large telco has a /16 from which
> they many years ago allocated a /24 to a multihomed customer. This customer
> left but took their /24 with them. This fact will seldom make the large
> telco split up their /16. They will keep it as a /16 but will no longer
> route to that /24. The question is also if we really would want a large
> telco to explode a large subnet due to this case.
>

No way, agreed!

But, as I explained, it's also unnecessary; I mean, that's exactly why we
do `most specific' routing. Just don't kill the subprefix announcement!

btw... yes, this is a possible issue with ROV, when sometimes there's a ROA
for a prefix (say /16) but no ROA for a (legitimately announced) subprefix
(e.g. /20). We show such a case in our 2015 ROV paper, and also measured how
many such issues exist; it appears their number is much reduced now, based
on more recent measurements. (ah and here, our ROV++ doesn't help; in fact,
it would make disconnection even more likely than with ROV, since ROV
protection against subprefix hijacks is rather weak).

Regards,
--
Amir Herzberg

Comcast professor of Security Innovations, Computer Science and
Engineering, University of Connecticut
Homepage: https://sites.google.com/site/amirherzberg/home
`Applied Introduction to Cryptography' textbook and lectures:
 https://sites.google.com/site/amirherzberg/applied-crypto-textbook


>


Re: "Tactical" /24 announcements

2021-08-12 Thread Baldur Norddahl
On Thu, Aug 12, 2021 at 7:39 PM Amir Herzberg  wrote:

> Bill, I beg to respectfully differ, knowing that I'm just a researcher and
> not working `for real' like you guys, so pls take no offence.
>
> I don't think A would be right to filter these packets to 10.0.1.0/24; A
> has announced 10.0.0.0/16 so should route to that (entire) prefix, or A
> is misleading its peers.
>

You are right that it is wrong but it happens. Some years back I tried a
setup where we wanted to reduce the size of the routing table. We dropped
everything but routes received from peers and added a default to one of our
IP transit providers. This should have been ok because either we had a
route to a peer or the packet would go to someone who had the full routing
table, yes?

So we got complaints. One was a company who would advertise a /20 on a
peering with us. But somewhere else far away they had a site from where
they would announce a /24 from the same prefix. With no internal routing
between the peering site with the /20 to the other site with the /24. We
therefore lost the ability to communicate with that /24.

You see variants of this. For example a large telco has a /16 from which
they many years ago allocated a /24 to a multihomed customer. This customer
left but took their /24 with them. This fact will seldom make the large
telco split up their /16. They will keep it as a /16 but will no longer
route to that /24. The question is also if we really would want a large
telco to explode a large subnet due to this case.

Regards,

Baldur


Re: "Tactical" /24 announcements

2021-08-12 Thread Jon Lewis

On Thu, 12 Aug 2021, William Herrin wrote:


> On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher wrote:
>> On 12/08/2021 17:59, William Herrin wrote:
>>> If you prune the routes from the Routing Information Base instead, for
>>> any widely accepted size (i.e. /24 or shorter netmask) you break the
>>> Internet.
>>
>> How does this break the Internet?  I would think it would just result in
>> sub-optimal routing (provided there is a covering larger prefix) but
>> everything should continue to work.  Clue me in, please.
>
> A originates 10.0.0.0/16 to paid transit C
> B originates 10.0.1.0/24 also to paid transit C
> C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
> on same-next-hop
> You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
> originate 10.0.1.0/24 and D has discarded it.
> You send packets for 10.0.1.0/24 to A (the shortest path for
> 10.0.0.0/16), stealing A's paid transit to C to get to B.
> Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
> doesn't currently transit for A so from B's perspective that's not an
> allowed path. In which case, your path to 10.0.1.0/24 is black holed.
>
> D broke the Internet. If packets from you reach A at all, they do so
> through an unpermitted path.


A originated the /16 and should be prepared to deal with all bits to IPs 
within it.


What's worse is when A originates/advertises the /16 to C.  A also 
advertises the /24(s) only to other transits D, E, and F.  C's peers that 
don't see the subnets send traffic to C that C then has to send out via 
transit to reach D, E, or F.  I've been C :(  We asked A to make it stop.



--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: "Tactical" /24 announcements

2021-08-12 Thread William Herrin
On Thu, Aug 12, 2021 at 10:39 AM Amir Herzberg  wrote:
> On Thu, Aug 12, 2021 at 1:22 PM William Herrin  wrote:
>> A originates 10.0.0.0/16 to paid transit C
>> B originates 10.0.1.0/24 also to paid transit C

> Bill, I beg to respectfully differ, knowing that I'm just a researcher and
> not working `for real' like you guys, so pls take no offence.

Hi Amir,

Why would I take offense? How do any of us learn except by trying to
poke holes in claims to see what holds up and what doesn't?


> I don't think A would be right to filter these packets to 10.0.1.0/24; A has 
> announced 10.0.0.0/16 so should route to that (entire) prefix, or A is 
> misleading its peers.

The alternative is that A has to disaggregate 10.0.0.0/16 into at
least 8 prefixes on the -possibility- that some jackass might filter
the one /24 that B announces. If trying to filter one route results in
7 extra routes being added to the table, that's net badness.
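
The arithmetic behind "8 prefixes" and "7 extra routes" can be checked with a small Python sketch. It assumes A would disaggregate so that it no longer announces any cover for B's /24; the variable names are illustrative only:

```python
import ipaddress

aggregate = ipaddress.ip_network("10.0.0.0/16")  # A's announcement
hole = ipaddress.ip_network("10.0.1.0/24")       # B's announcement

# Everything in the /16 except B's /24, as the minimal set of prefixes.
remainder = sorted(aggregate.address_exclude(hole))
for net in remainder:
    print(net)

routes_before = 2                  # A's /16 plus B's /24
routes_after = len(remainder) + 1  # A's disaggregated prefixes plus B's /24
print(len(remainder), routes_after - routes_before)
```

The remainder is one prefix at each length from /17 through /24, which is where the "at least 8" comes from.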

Filtering may not even be intentional on A's part. If A's peering
router only receives A's customer-originated routes (a common enough
architecture) then the peering router won't even have a route to B
while B's route only arrives from C.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-12 Thread Jon Lewis

On Thu, 12 Aug 2021, Nick Hilliard wrote:


> Jon Lewis wrote on 12/08/2021 18:09:
>> Arista.  They call it FIB compression.  They mention it's a trade-off,
>> more memory and CPU utilization (keeping track of things) in exchange for
>> being able to keep hardware that might otherwise be out of FIB space able
>> to cope with full tables.
>
> it also causes non-deterministic fib resource consumption. On most edge
> deployments this won't matter, but it wouldn't be hard to cook up a topology
> that could fail in interesting ways.  Overall fib compression is a net win,
> but you need to be careful with it.


Yeah...changes to the network could suddenly run such a box out of FIB 
resources, and you could easily be wrong when predicting how much longer a 
box has for its "full routes" days...but the alternatives are "don't do 
full routes" or replace the box much sooner.
In that respect, it's somewhat remarkable that Arista even developed the 
feature.  "We can sell them newer hardware with larger FIB capabilities, 
or offer a software update that extends the life of the gear they've 
already bought."  What company chooses the latter? :)


--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: "Tactical" /24 announcements

2021-08-12 Thread Amir Herzberg
On Thu, Aug 12, 2021 at 1:22 PM William Herrin  wrote:

> On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher 
> wrote:
> > On 12/08/2021 17:59, William Herrin wrote:
> > > If you prune the routes from the Routing Information Base instead, for
> > > any widely accepted size (i.e. /24 or shorter netmask) you break the
> > > Internet.
> >
> > How does this break the Internet?  I would think it would just result in
> > sub-optimal routing (provided there is a covering larger prefix) but
> > everything should continue to work.  Clue me in, please.
>
> A originates 10.0.0.0/16 to paid transit C
> B originates 10.0.1.0/24 also to paid transit C
> C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
> on same-next-hop
> You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
> originate 10.0.1.0/24 and D has discarded it.
> You send packets for 10.0.1.0/24 to A (the shortest path for
> 10.0.0.0/16), stealing A's paid transit to C to get to B.

> Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
> doesn't currently transit for A so from B's perspective that's not an
> allowed path. In which case, your path to 10.0.1.0/24 is black holed.
>

Bill, I beg to respectfully differ, knowing that I'm just a researcher and
not working `for real' like you guys, so pls take no offence.

I don't think A would be right to filter these packets to 10.0.1.0/24; A
has announced 10.0.0.0/16 so should route to that (entire) prefix, or A is
misleading its peers.

If A doesn't, though, then B receives a packet from A to 10.0.1.0/24.
Unless B is filtering based on the specific IP prefixes of A - which seems
to me unlikely - then B has no way to know that this packet is from `you'
rather than from A itself (or a customer of A). So B will carry this
traffic, imho.

So A is just paying for the traffic since it announced the prefix.

Such situations, to the best of my knowledge, actually happen on the Internet
when a subprefix is filtered for different reasons. We observed this happening
with ROV in our ROV++ simulations, but I'll refrain
from attaching the URL again so as not to be `plugging' that paper (and since
I'm too lazy to look it up hhh)

have great day and I'll be happy to learn if I'm wrong. Amir


Re: "Tactical" /24 announcements

2021-08-12 Thread Nick Hilliard

Jon Lewis wrote on 12/08/2021 18:09:
> Arista.  They call it FIB compression.  They mention it's a trade-off,
> more memory and CPU utilization (keeping track of things) in exchange
> for being able to keep hardware that might otherwise be out of FIB space
> able to cope with full tables.


it also causes non-deterministic fib resource consumption. On most edge 
deployments this won't matter, but it wouldn't be hard to cook up a 
topology that could fail in interesting ways.  Overall fib compression 
is a net win, but you need to be careful with it.


Nick


Re: "Tactical" /24 announcements

2021-08-12 Thread William Herrin
On Thu, Aug 12, 2021 at 10:19 AM William Herrin  wrote:
> On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher  wrote:
> > On 12/08/2021 17:59, William Herrin wrote:
> > > If you prune the routes from the Routing Information Base instead, for
> > > any widely accepted size (i.e. /24 or shorter netmask) you break the
> > > Internet.
> >
> > How does this break the Internet?  I would think it would just result in
> > sub-optimal routing (provided there is a covering larger prefix) but
> > everything should continue to work.  Clue me in, please.
>
> A originates 10.0.0.0/16 to paid transit C
> B originates 10.0.1.0/24 also to paid transit C
> C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
> on same-next-hop
> You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
> originate 10.0.1.0/24 and D has discarded it.
> You send packets for 10.0.1.0/24 to A (the shortest path for
> 10.0.0.0/16), stealing A's paid transit to C to get to B.
>Unless A filters C-bound packets purportedly from 10.0.1.0/24.

I mashed this sentence together wrong. I meant to say: "Unless A filters
packets from peers which would use their paid transit," a common
policy restriction placed on settlement-free peering.

>B
> doesn't currently transit for A so from B's perspective that's not an
> allowed path. In which case, your path to 10.0.1.0/24 is black holed.
>
> D broke the Internet. If packets from you reach A at all, they do so
> through an unpermitted path.



-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-12 Thread Tom Hill
On 12/08/2021 18:09, Jon Lewis wrote:
>> 
>> Having an upstream provider that did it, in a very aggressive
>> fashion.
> 
> Odds are, they did it wrong, and you had no control and limited, if
> any, visibility into what they did.  Obviously, if you're going to
> blindly filter routes based on prefix-length, you need to point
> default at something that doesn't...and if you're acting as a transit
> provider, you're likely no longer able to provide "full routes" to
> customers from devices doing this or fed the "not so full table" from
> devices doing it.


Yes. This is precisely why I wrote my initial email, and perhaps I
wasn't specific enough, but it was a fairly generic warning against
"bright ideas" that don't include the proper scrutiny (or _do_ include
unnecessary amounts of arrogance).


> Arista.  They call it FIB compression.  They mention it's a
> trade-off, more memory and CPU utilization (keeping track of things)
> in exchange for being able to keep hardware that might otherwise be
> out of FIB space able to cope with full tables.

Ah, thank you, noted.

-- 
Tom


Re: "Tactical" /24 announcements

2021-08-12 Thread William Herrin
On Thu, Aug 12, 2021 at 9:41 AM Hank Nussbacher  wrote:
> On 12/08/2021 17:59, William Herrin wrote:
> > If you prune the routes from the Routing Information Base instead, for
> > any widely accepted size (i.e. /24 or shorter netmask) you break the
> > Internet.
>
> How does this break the Internet?  I would think it would just result in
> sub-optimal routing (provided there is a covering larger prefix) but
> everything should continue to work.  Clue me in, please.

A originates 10.0.0.0/16 to paid transit C
B originates 10.0.1.0/24 also to paid transit C
C offers both routes to D. D discards 10.0.1.0/24 from the RIB based
on same-next-hop
You peer with A and D. You receive only 10.0.0.0/16 since A doesn't
originate 10.0.1.0/24 and D has discarded it.
You send packets for 10.0.1.0/24 to A (the shortest path for
10.0.0.0/16), stealing A's paid transit to C to get to B.
Unless A filters C-bound packets purportedly from 10.0.1.0/24. B
doesn't currently transit for A so from B's perspective that's not an
allowed path. In which case, your path to 10.0.1.0/24 is black holed.

D broke the Internet. If packets from you reach A at all, they do so
through an unpermitted path.
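
To make the forwarding decision concrete, here is a small Python sketch of longest-prefix match over the two views "You" could end up with. The tables are hypothetical: `full_view` assumes D had passed the /24 along, `pruned_view` is what is left after D discards it, and the next-hop labels are just the letters used above:

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match: next hop of the most specific covering route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

# What "You" would hold if D had not pruned the more-specific (assumption).
full_view = {"10.0.0.0/16": "A", "10.0.1.0/24": "D"}
# What "You" actually hold after D drops 10.0.1.0/24 from its RIB.
pruned_view = {"10.0.0.0/16": "A"}

print(lookup(full_view, "10.0.1.7"))    # follows the /24
print(lookup(pruned_view, "10.0.1.7"))  # falls back to A's /16
```

With the pruned view, traffic for B's /24 is handed to A, which is exactly the path that A may filter or that uses A's paid transit.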

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-12 Thread Amir Herzberg
On Thu, Aug 12, 2021 at 12:43 PM Hank Nussbacher 
wrote:

> On 12/08/2021 17:59, William Herrin wrote:
>
> > If you prune the routes from the Routing Information Base instead, for
> > any widely accepted size (i.e. /24 or shorter netmask) you break the
> > Internet.
>
> How does this break the Internet?  I would think it would just result in
> sub-optimal routing (provided there is a covering larger prefix) but
> everything should continue to work.  Clue me in, please.
>

Hi Hank, I think you're right, it could result in sub-optimal routing and
in particular, in your AS not being used for these subprefixes (the traffic
will go instead to a competing provider who sent the subprefix), hence, as
you said, sub-optimal routing.

I think some people (maybe Bill included) may consider the resulting harm
to routing to be sufficiently severe to consider this `breaking'. It
becomes a judgement call, I guess.

Cheers, Amir

>
> -Hank
>
>


Re: "Tactical" /24 announcements

2021-08-12 Thread Jon Lewis

On Thu, 12 Aug 2021, Tom Hill wrote:


> On 11/08/2021 14:09, Jon Lewis wrote:
>> What sort of hands-on experience is this opinion based on?
>
> Having an upstream provider that did it, in a very aggressive fashion.

Odds are, they did it wrong, and you had no control and limited, if any,
visibility into what they did.  Obviously, if you're going to blindly
filter routes based on prefix-length, you need to point default at
something that doesn't...and if you're acting as a transit provider,
you're likely no longer able to provide "full routes" to customers from
devices doing this or fed the "not so full table" from devices doing it.
It can work though, and on pretty much any platform.

> Limiting the pruning to cases with the same next-hop does indeed sound
> like it would be safer than what I've seen done in the past.
>
> Doing this with per-peer prefix-lists would not (certainly not in
> classic IOS) provide you with the ability to compare the next-hop before
> rejecting those more-specific prefixes, and likely why the attempts to
> do this caused the problems that I'm referring to.
>
> I'm glad to hear a vendor has implemented a useful knob. Which vendor?


Arista.  They call it FIB compression.  They mention it's a trade-off, 
more memory and CPU utilization (keeping track of things) in exchange for 
being able to keep hardware that might otherwise be out of FIB space able 
to cope with full tables.


--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: "Tactical" /24 announcements

2021-08-12 Thread Hank Nussbacher

On 12/08/2021 17:59, William Herrin wrote:


> If you prune the routes from the Routing Information Base instead, for
> any widely accepted size (i.e. /24 or shorter netmask) you break the
> Internet.


How does this break the Internet?  I would think it would just result in 
sub-optimal routing (provided there is a covering larger prefix) but 
everything should continue to work.  Clue me in, please.


-Hank



Re: "Tactical" /24 announcements

2021-08-12 Thread William Herrin
On Thu, Aug 12, 2021 at 7:44 AM Tom Hill  wrote:
> On 11/08/2021 14:09, Jon Lewis wrote:
> > At least one major network hardware vendor has implemented it as a
> > feature.  Turn it on, and the "deaggregates" with same next-hop as an
> > aggregate are not programmed into the FIB.  The savings will vary
> > depending on the device's connectivity, but I've seen >40%.
>
> Limiting the pruning to cases with the same next-hop does indeed sound
> like it would be safer than what I've seen done in the past.

Hi Tom,

To be clear, Jon was talking about pruning it from the FIB not the
RIB. You can always safely prune overlapping routes with the same next
hop from the Forwarding Information Base because the FIB lookup will
still select the same next hop regardless. This is valuable because
the main cost driver is carrying the routes in the FIB table that's
consulted for every packet handled.

If you prune the routes from the Routing Information Base instead, for
any widely accepted size (i.e. /24 or shorter netmask) you break the
Internet. Just because it's the same next hop for you doesn't mean the
routes actually share fate. The routers past you need both routes to
figure out their own position in a valid path. And it doesn't save you
much anyway because the RIB is only consulted when routes change and
need to be reprocessed into FIB entries. A $10 virtual server can
handle today's BGP RIB with ease and equipment with only a little more
power could handle one much larger. It's the FIB which drives the
limits.
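
A rough Python sketch of why the FIB-side pruning is safe. This is not any vendor's actual algorithm, just an illustration under assumptions: the RIB is a hypothetical IPv4-only dict of prefix to next hop, and a route is left out of the FIB when its nearest covering route has the same next hop. Lookups against the full RIB and the compressed FIB should then agree for every destination:

```python
import ipaddress

def lookup(table, dst):
    """Longest-prefix match over a dict of ip_network -> next hop."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]

def compress(rib):
    """Skip routes whose nearest covering route has the same next hop."""
    fib = {}
    for net, next_hop in rib.items():
        covers = [c for c in rib if c != net and net.subnet_of(c)]
        if covers:
            parent = max(covers, key=lambda c: c.prefixlen)
            if rib[parent] == next_hop:
                continue  # forwarding result is unchanged without this entry
        fib[net] = next_hop
    return fib

# Hypothetical IPv4-only RIB; next hops are just labels.
rib = {
    ipaddress.ip_network("10.0.0.0/16"): "C",
    ipaddress.ip_network("10.0.1.0/24"): "C",    # same next hop as the /16
    ipaddress.ip_network("10.0.2.0/24"): "E",    # different next hop, must stay
    ipaddress.ip_network("10.0.2.128/25"): "C",  # differs from its parent /24
}
fib = compress(rib)
probes = ["10.0.1.5", "10.0.2.5", "10.0.2.200", "10.0.9.9", "192.0.2.1"]
print(len(rib), len(fib))
print(all(lookup(rib, p) == lookup(fib, p) for p in probes))
```

Note that the RIB itself is untouched, so every route can still be advertised onward; only the forwarding table shrinks.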

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-12 Thread Tom Hill
On 11/08/2021 14:09, Jon Lewis wrote:
> What sort of hands-on experience is this opinion based on?

Having an upstream provider that did it, in a very aggressive fashion.


> I've done this manually in the past (quite some time ago), and done
> properly, it works fine.
> 
> At least one major network hardware vendor has implemented it as a
> feature.  Turn it on, and the "deaggregates" with same next-hop as an
> aggregate are not programmed into the FIB.  The savings will vary
> depending on the device's connectivity, but I've seen >40%.


Limiting the pruning to cases with the same next-hop does indeed sound
like it would be safer than what I've seen done in the past.

Doing this with per-peer prefix-lists would not (certainly not in
classic IOS) provide you with the ability to compare the next-hop before
rejecting those more-specific prefixes, and likely why the attempts to
do this caused the problems that I'm referring to.

I'm glad to hear a vendor has implemented a useful knob. Which vendor?

-- 
Tom


Re: "Tactical" /24 announcements

2021-08-11 Thread Mark Tinka




On 8/11/21 12:24, Tom Hill wrote:


> Such anti-disaggregation/save-my-TCAM efforts really do not work, and
> will spawn all manner of support tickets. I'm saying this in the hope
> that it may prevent someone from reading this thread and concluding that
> it may be a good idea to try. It is not.


We've been doing this on low-FIB platforms (mainly for our Metro, where 
we want to hold a full table in RIB and a few internal routes in FIB) 
since about 2014, and it works - as Scar in "The Lion King" would say - 
rather well.


The only downside is when your IGP and LDP database grows larger than 
the available FIB. But that's another issue; although not an 
insignificant one.


Mark.


Re: "Tactical" /24 announcements

2021-08-11 Thread Jon Lewis

On Wed, 11 Aug 2021, Tom Hill wrote:


> On 10/08/2021 07:15, Lukas Tribus wrote:
>>> Are there any big networks that drop or penalize announcements like this?
>> It's possible you could get your peering request denied for this. I
>> have put *reasonable* prefix aggregation into peering requirements for
>> some years now. If you are a small eyeball network with 8192 IP
>> addresses and originate 32 /24's, that is *not* reasonable.
>
> It is quite an issue when a network tries to programmatically filter-out
> the /24 more-specifics advertisements made from an allocated, e.g., /20.
>
> Such anti-disaggregation/save-my-TCAM efforts really do not work, and
> will spawn all manner of support tickets. I'm saying this in the hope
> that it may prevent someone from reading this thread and concluding that
> it may be a good idea to try. It is not.


What sort of hands-on experience is this opinion based on?

I've done this manually in the past (quite some time ago), and done 
properly, it works fine.


At least one major network hardware vendor has implemented it as a 
feature.  Turn it on, and the "deaggregates" with same next-hop as an 
aggregate are not programmed into the FIB.  The savings will vary 
depending on the device's connectivity, but I've seen >40%.



--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: "Tactical" /24 announcements

2021-08-11 Thread Lukas Tribus
On Wed, 11 Aug 2021 at 12:24, Tom Hill  wrote:
>
> On 10/08/2021 07:15, Lukas Tribus wrote:
> >> Are there any big networks that drop or penalize announcements like this?
> > It's possible you could get your peering request denied for this. I
> > have put *reasonable* prefix aggregation into peering requirements for
> > some years now. If you are a small eyeball network with 8192 IP
> > addresses and originate 32 /24's, that is *not* reasonable.
>
> It is quite an issue when a network tries to programmatically filter-out
> the /24 more-specifics advertisements made from an allocated, e.g., /20.
>
> Such anti-disaggregation/save-my-TCAM efforts really do not work, and
> will spawn all manner of support tickets. I'm saying this in the hope
> that it may prevent someone from reading this thread and concluding that
> it may be a good idea to try. It is not.

For the record: I did not suggest anything like this.

Denying peering requests due to lack of *reasonable* prefix
aggregation does not mean installing fancy, impossible to maintain
prefix-lists on transit ingress. I agree with you here, that would be
very bad.

This save-my-TCAM effort is successful when the peer on the other side
actually realizes that there are consequences to decisions like this
and reverts it, which is a long shot, sure, but at least I'm not
encouraging this. I don't get to dictate other people's configurations.
I do get to decide who is directly exchanging traffic with my network
and who isn't.


lukas


Re: "Tactical" /24 announcements

2021-08-11 Thread Mark Tinka




On 8/11/21 12:07, Tom Hill wrote:


> 2914 permit you to leak prefixes as specific as a /28 between your own
> ports with them. Someone once referred to it as a 'sneaky backhaul', I
> believe. Given that there's no default in 2914, I guess that counts? :D


I suppose some arrangement between you and your provider is alright. But
since I (and, I'm guessing, several other operators that purchase from
NTT) will not accept anything longer than a /24 from them, it only serves
internal forwarding within NTT for their customers that leak /25's or
longer into them.


I was wondering why we haven't seen this take off in the global DFZ :-).

Mark.


Re: "Tactical" /24 announcements

2021-08-11 Thread Tom Hill
On 10/08/2021 07:15, Lukas Tribus wrote:
>> Are there any big networks that drop or penalize announcements like this?
> It's possible you could get your peering request denied for this. I
> have put *reasonable* prefix aggregation into peering requirements for
> some years now. If you are a small eyeball network with 8192 IP
> addresses and originate 32 /24's, that is *not* reasonable.

It is quite an issue when a network tries to programmatically filter-out
the /24 more-specifics advertisements made from an allocated, e.g., /20.

Such anti-disaggregation/save-my-TCAM efforts really do not work, and
will spawn all manner of support tickets. I'm saying this in the hope
that it may prevent someone from reading this thread and concluding that
it may be a good idea to try. It is not.

Speaking to your peers is good, as I think you're encouraging there. I
would of course default to asking them if they've read from the Good
Book of RPKI. :)

I also often find that very outdated "Good Security Practice" is as much
to blame for this as anything else, and so when we do talk to our peers
and/or customers, we should always be asking the question: "who told you
this was a good idea?"

-- 
Tom


Re: "Tactical" /24 announcements

2021-08-11 Thread Tom Hill
On 10/08/2021 12:31, Mark Tinka wrote:
> Been waiting for the day when /27's, /28's and /29's are going to make
> it into the DFZ, as was promised 5 or more years ago :-).


2914 permit you to leak prefixes as specific as a /28 between your own
ports with them. Someone once referred to it as a 'sneaky backhaul', I
believe. Given that there's no default in 2914, I guess that counts? :D

-- 

I'm really not being serious. A nice feature by NTT, but please let's
never make it OK to populate the _actual_ DFZ with an IPv4 prefix
longer than a /24.

-- 
Tom


Re: "Tactical" /24 announcements

2021-08-10 Thread Mark Tinka




On 8/9/21 19:38, Tom Beecher wrote:

> Folks can announce longer than 24 masks all day. They're unlikely to
> propagate very far though, since most won't accept longer than 24 from
> the world at large.


Been waiting for the day when /27's, /28's and /29's are going to make 
it into the DFZ, as was promised 5 or more years ago :-).


Mark.


Re: "Tactical" /24 announcements

2021-08-10 Thread Masataka Ohta

Sabri Berisha wrote:


> Just for fun, I did the math. A total of 16,777,216 /24s fit in 32
> bits. Take away all the reserved space as per IANA (this is 1,266,696
> /24s, see below),

> 240.0.0.0/4    1048576

I think we should also take away multicast addresses of

> 224.0.0.0/4    1048576

because multicast routes cannot be aggregated and must
be treated as /32.
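
The figures quoted here are easy to re-derive. A short Python check (the helper name `count_24s` is just for illustration):

```python
import ipaddress

def count_24s(prefix):
    """Number of /24s contained in an IPv4 prefix no longer than /24."""
    return ipaddress.ip_network(prefix).num_addresses // 256

total_24s = 2 ** 24                     # all /24s in the 32-bit space
reserved_e = count_24s("240.0.0.0/4")   # the 240/4 block
multicast = count_24s("224.0.0.0/4")    # the 224/4 block

print(total_24s, reserved_e, multicast)
```

Each /4 accounts for 2^20 /24s, which matches the 1048576 shown on both lines.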

Anyway,


> The largest FIB table I have seen (hi Jim!) was 3,563,546 routes in
> hardware. This was in a lab environment, of course.


for /24, these days, having 16M entry SRAM (simple one, not
TCAM) is trivially easy.

Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-10 Thread Lukas Tribus
On Mon, 9 Aug 2021 at 17:47, Billy Croan  wrote:
> Are there any big networks that drop or penalize announcements like this?

It's possible you could get your peering request denied for this. I
have put *reasonable* prefix aggregation into peering requirements for
some years now. If you are a small eyeball network with 8192 IP
addresses and originate 32 /24's, that is *not* reasonable.
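
For scale, 8192 addresses is a single /19. A quick Python illustration, using a made-up allocation of 10.32.0.0/19 as a stand-in for such a network:

```python
import ipaddress

# Hypothetical 8192-address allocation.
block = ipaddress.ip_network("10.32.0.0/19")

# The deaggregated form: every /24 announced separately.
deaggregated = list(block.subnets(new_prefix=24))

# What the same address space collapses to when aggregated.
collapsed = list(ipaddress.collapse_addresses(deaggregated))

print(block.num_addresses, len(deaggregated), collapsed)
```

So the 32 announcements carry no more reachability information than the one aggregate would.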



On Mon, 9 Aug 2021 at 17:47, Billy Croan  wrote:
> to originate everything as distinct /24 prefixes, to reduce the effect of
> a potential bgp hijack.

Some men just want to see the world burn.


lukas


Re: "Tactical" /24 announcements

2021-08-09 Thread Lady Benjamin Cannon of Glencoe, ASCE
This will break the internet at scale. No.

Ms. Lady Benjamin PD Cannon of Glencoe, ASCE
6x7 Networks & 6x7 Telecom, LLC 
CEO 
l...@6by7.net
"The only fully end-to-end encrypted global telecommunications company in the 
world.”

FCC License KJ6FJJ

Sent from my iPhone via RFC1149.

> On Aug 9, 2021, at 5:20 PM, Robert McKay  wrote:
> 
> On 2021-08-09 22:39, Baldur Norddahl wrote:
>> On Mon, 9 Aug 2021 at 22:13, Grzegorz Janoszka wrote:
>>> On 2021-08-09 17:47, Billy Croan wrote:
>>>> How does the community feel about using /24 originations in BGP as a
>>>> tactical advantage against potential bgp hijackers?
>>> RPKI is more effective than a competing /24. Unless they hijack your ASN
>>> as well.
>> You will usually get an as path length advantage even if they do
>> hijack your asn.
> 
> Unless your RPKI is set to allow /24 but you normally advertise /21 or 
> something shorter.. then RPKI works to the hijacker's advantage.
> 
> You could argue this is no different than before RPKI which is true.. except 
> that now that RPKI exists people are tempted to use it to automate 
> configuration and take humans out of the loop.
> 
> I imagine there are quite a few RPKI enabled prefixes (those configured to 
> allow too long advertisements) that are easier to hijack now than they were 
> before RPKI existed.
> 
> -Rob


Re: "Tactical" /24 announcements

2021-08-09 Thread Robert McKay

On 2021-08-09 22:39, Baldur Norddahl wrote:
> On Mon, 9 Aug 2021 at 22:13, Grzegorz Janoszka wrote:
>> On 2021-08-09 17:47, Billy Croan wrote:
>>> How does the community feel about using /24 originations in BGP as a
>>> tactical advantage against potential bgp hijackers?
>> RPKI is more effective than a competing /24. Unless they hijack your ASN
>> as well.
> You will usually get an as path length advantage even if they do
> hijack your asn.


Unless your RPKI is set to allow /24 but you normally advertise /21 or 
something shorter.. then RPKI works to the hijacker's advantage.


You could argue this is no different than before RPKI which is true.. 
except that now that RPKI exists people are tempted to use it to 
automate configuration and take humans out of the loop.


I imagine there are quite a few RPKI enabled prefixes (those configured 
to allow too long advertisements) that are easier to hijack now than 
they were before RPKI existed.
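
A sketch of the maxLength problem, following the valid / invalid / not-found outcomes of origin validation as I understand them from RFC 6811. The ROA contents, the prefix 10.8.0.0/21 and AS 64500 are made-up example values, and `validate` is an illustrative function, not a real validator's API:

```python
import ipaddress
from typing import NamedTuple

class ROA(NamedTuple):
    prefix: str
    max_length: int
    asn: int

def validate(roas, prefix, origin_asn):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue  # this ROA does not cover the announced prefix
        covered = True
        if roa.asn == origin_asn and net.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"

# The network normally announces only the /21.
loose = ROA("10.8.0.0/21", 24, 64500)   # maxLength allows up to /24
strict = ROA("10.8.0.0/21", 21, 64500)  # maxLength matches what is announced

# A hijacker announces a /24 with the victim's ASN forged as origin.
print(validate([loose], "10.8.3.0/24", 64500))
print(validate([strict], "10.8.3.0/24", 64500))
```

With the loose ROA the forged-origin /24 passes validation and wins on prefix length; with the strict one it is rejected by anyone doing ROV.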


-Rob


Re: "Tactical" /24 announcements

2021-08-09 Thread Baldur Norddahl
On Mon, 9 Aug 2021 at 22:13, Grzegorz Janoszka wrote:

> On 2021-08-09 17:47, Billy Croan wrote:
> > How does the community feel about using /24 originations in BGP as a
> > tactical advantage against potential bgp hijackers?
>
> RPKI is more effective than a competing /24. Unless they hijack your ASN
> as well.
>

You will usually get an as path length advantage even if they do hijack
your asn.

Regards

Baldur


Re: "Tactical" /24 announcements

2021-08-09 Thread Grzegorz Janoszka

On 2021-08-09 17:47, Billy Croan wrote:

> How does the community feel about using /24 originations in BGP as a
> tactical advantage against potential bgp hijackers?

RPKI is more effective than a competing /24. Unless they hijack your ASN
as well.


--
Grzegorz Janoszka


Re: "Tactical" /24 announcements

2021-08-09 Thread Amir Herzberg
Bill said,

> > Is this seen as route table pollution, or a necessary evil in today's
> world?
>
> Pollution. And it won't save you from a hijack either, since your
> adversary's /24 routes will compete and win for at least part of the
> Internet.
>

I agree, of course, that moving to announce every /24 would pollute the
net. Note that if you use ROAs, you'll also have to make corresponding /24
ROAs, and I don't know whether this would also have a problematic impact on
the RPKI infrastructure. Not good.

But:
- Assuming the /24 has a proper ROA, and ROV is reasonably widely deployed,
this _would_ protect most of the traffic sent to the /24 from a hijacker
announcing the /24 (and even more if the hijack is of a shorter prefix, of
course).
- As long as ROV isn't _very_ widely deployed, it would often fail to
protect against the hijack without such a measure (a competing /24), so
this will remain necessary (if you wish to prevent the hijack).

We've done some relevant simulations, and have also proposed a simple
extension to ROV, called ROV++, which protects against such sub-prefix
hijacks without requiring a competing /24 announcement, and is already
effective with modest adoption (of ROV++) by BGP routers. (It should also be
helped by mixed ROV / ROV++ adoption, but we didn't do these simulations yet.)

See at:
https://www.ndss-symposium.org/ndss-paper/rov-improved-deployable-defense-against-bgp-hijacking/

tl;dr: ROV++ routers would blackhole subprefix traffic rather than send
it on a route which would be hijacked (i.e., if the route is to a neighbor
AS that announced the legit prefix _and_ the hijacked subprefix). Simple.

[and no, I'm not happy with the resulting disconnections. but it's better
than hijack imho]
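
The tl;dr above can be illustrated with a toy forwarding table. This is only a sketch of the one rule stated in this message, not the full algorithm from the paper. The route tuples, the neighbor names AS64510/AS64511 and the functions build_fib and lookup are hypothetical.

```python
from ipaddress import ip_address, ip_network


def build_fib(routes, mode):
    """Build a toy FIB from (prefix, neighbor, rov_state) tuples.

    mode is "rov" or "rov++". A next hop of None means "blackhole".
    """
    fib = {}
    # Plain ROV: install everything that is not invalid, drop the rest.
    for prefix, neighbor, state in routes:
        if state != "invalid":
            fib[ip_network(prefix)] = neighbor
    if mode == "rov++":
        for prefix, neighbor, state in routes:
            net = ip_network(prefix)
            if state != "invalid" or net in fib:
                continue
            # If the covering route points at the same neighbor that also
            # announced the invalid subprefix, blackhole the subprefix
            # instead of forwarding it towards the hijack.
            for covering, hop in list(fib.items()):
                if (hop == neighbor and covering.version == net.version
                        and net.subnet_of(covering)):
                    fib[net] = None
                    break
    return fib


def lookup(fib, address):
    """Longest-prefix match; returns a neighbor, 'blackhole' or 'no route'."""
    addr = ip_address(address)
    matches = [net for net in fib if addr in net]
    if not matches:
        return "no route"
    best = max(matches, key=lambda net: net.prefixlen)
    return "blackhole" if fib[best] is None else fib[best]


routes = [
    ("10.20.0.0/21", "AS64510", "valid"),    # legitimate covering prefix
    ("10.20.4.0/24", "AS64510", "invalid"),  # hijacked subprefix, same neighbor
]
print(lookup(build_fib(routes, "rov"), "10.20.4.1"))    # AS64510
print(lookup(build_fib(routes, "rov++"), "10.20.4.1"))  # blackhole
```

Under plain ROV the packet still follows the /21 to a neighbor that itself carries the hijacked /24; the ROV++ variant drops it instead, which is the disconnection trade-off mentioned above.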

best, Amir
-- 
Amir Herzberg

Comcast professor of Security Innovations, Computer Science and
Engineering, University of Connecticut
Homepage: https://sites.google.com/site/amirherzberg/home
`Applied Introduction to Cryptography' textbook and lectures:
 https://sites.google.com/site/amirherzberg/applied-crypto-textbook





On Mon, Aug 9, 2021 at 12:10 PM William Herrin  wrote:

> On Mon, Aug 9, 2021 at 8:48 AM Billy Croan 
> wrote:
> > How does the community feel about using /24 originations in BGP as a
> > tactical advantage against potential bgp hijackers?
> > How many routers out there today would be affected if everyone did this?
>
> Hi Billy,
>
> I did some math on this years ago and it worked out to about 8.5
> million IPv4 routes. That's 10 times the current table size, more than
> any big-iron router can handle today. If everybody did it, it'd crash
> the Internet.
>
> > Is this seen as route table pollution, or a necessary evil in today's
> world?
>
> Pollution. And it won't save you from a hijack either, since your
> adversary's /24 routes will compete and win for at least part of the
> Internet.
>
> > Are there any big networks that drop or penalize announcements like this?
>
> Not in an automated way. Which is bad news for you if you do this
> because it means getting folks to -undo- the restrictions they
> manually enforce on your specific address space is nearly impossible.
>
> Regards,
> Bill Herrin
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: "Tactical" /24 announcements

2021-08-09 Thread William Herrin
On Mon, Aug 9, 2021 at 10:31 AM Sabri Berisha  wrote:
> Just for fun, I did the math. A total of 16,777,216 /24s fit in 32 bits. Take 
> away all the reserved space as per IANA (this is 1,266,696 /24s, see below), 
> and we end up with 16,777,216 - 1,266,696 = 15,510,520 potential /24 
> advertisements.

Howdy,

It's not that simple. For example, 224/4 is not a 'reserved' space but
it can't appear in the unicast BGP table either. That alone is a
million routes unaccounted for in your math.
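
As a quick check of the "million routes" figure, the arithmetic can be written out. The 15,510,520 starting point is the number quoted above; the variable names are just for illustration.

```python
from ipaddress import ip_network

multicast = ip_network("224.0.0.0/4")

# A /4 contains 2^(24-4) distinct /24s.
multicast_24s = 2 ** (24 - multicast.prefixlen)

# Subtract them from the figure quoted above.
remaining = 15_510_520 - multicast_24s

print(multicast_24s)  # 1048576
print(remaining)      # 14461944
```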

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-09 Thread Rabbi Rob Thomas
Dear team,

I have resorted to more specific announcements during hijacks, though
with only one purpose in mind:  To buy us a bit of time while the
upstreams and peers put blocks in place to thwart the hijack as close to
the source as possible.  The more specifics are an imperfect solution,
since they don't always propagate as widely or as quickly as the
hijacks, but it buys us a bit of time.

The more important part of that solution is to network with fellow
network operators.  This is my go-to solution for everything from
hijacking to DDoS to "what the heck is that?!"  :)

Be well,
Rabbi Rob.


On 8/9/21 1:38 PM, Tom Beecher wrote:
> Folks can announce longer than 24 masks all day. They're unlikely to
> propagate very far though, since most won't accept longer than 24 from
> the world at large.
> 
> To the OP, there are some valid reasons to strategically deaggregate
> here and there, but a blanket "yolo my entire allocation into /24s"
> seems to be a pretty ill considered request.
> 
> On Mon, Aug 9, 2021 at 1:34 PM Hank Nussbacher wrote:
> 
> > On 09/08/2021 18:47, Billy Croan wrote:
> > > How does the community feel about using /24 originations in BGP as a
> > > tactical advantage against potential bgp hijackers?
> > >
> > > All of our allocations are larger and those prefixes we announce for
> > > clients as well usually are.  But we had a request recently to
> > > originate everything as distinct /24 prefixes, to reduce the effect of
> > > a potential bgp hijack.  It seemed a little bit like a tragedy of the
> > > commons situation.
> > >
> > > Is this seen as route table pollution, or a necessary evil in today's
> > > world?
> > > How many routers out there today would be affected if everyone did this?
> > > Are there any big networks that drop or penalize announcements like this?
> > >
> >
> > In addition to what everyone else said, announcing /24s will not help
> > you one bit since ASNs announce /25s, /26s, /27s, etc. Attached is a
> > 7800+ line text file sorted by ASN with prefixes being announced that
> > are more specific than /24 (only /25+/26+/27 listed).
> >
> > This is based on http://www.ris.ripe.net/dumps/riswhoisdump.IPv4.gz from
> > about a month ago.
> >
> > That dump lists all the IPv4 prefixes seen in the collective of latest
> > RIS table dumps, together with origin AS and number of peers that passed
> > the routes to RIS.
> >
> > So good luck with announcing /24s.
> >
> > Regards,
> > Hank
> 

-- 
Rabbi Rob Thomas   Team Cymru
   "It is easy to believe in freedom of speech for those with whom we
agree." - Leo McKern



Re: "Tactical" /24 announcements

2021-08-09 Thread Chris Cummings
I prefer the approach of disaggregating only when needed, not as a
preventative measure. There are tools that can help with automating this
disaggregation (ARTEMIS can do this, for example).
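
A minimal sketch of "disaggregate only when needed", assuming the reaction to a detected hijack is to announce the two halves of the hijacked prefix, and that nothing longer than /24 is worth announcing. This is not taken from ARTEMIS; the function names and the 10.20.0.0/21 prefix are hypothetical.

```python
from ipaddress import ip_network


def mitigation_prefixes(hijacked, longest=24):
    """Return the more-specifics to announce in response to a hijack."""
    net = ip_network(hijacked)
    if net.prefixlen >= longest:
        # A hijacked /24 cannot be out-specified in the global table.
        return []
    # Split the hijacked prefix into its two halves (one bit longer).
    return list(net.subnets(prefixlen_diff=1))


def blanket_24s(prefix):
    """The preventative alternative: every /24 inside the prefix."""
    return list(ip_network(prefix).subnets(new_prefix=24))


print(mitigation_prefixes("10.20.0.0/21"))  # two /22s, only while needed
print(len(blanket_24s("10.20.0.0/21")))     # 8 routes announced permanently
```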

—
Chris


On Mon, Aug 9, 2021 at 10:50 AM Billy Croan 
wrote:

> How does the community feel about using /24 originations in BGP as a
> tactical advantage against potential bgp hijackers?
>
> All of our allocations are larger and those prefixes we announce for
> clients as well usually are.  But we had a request recently to
> originate everything as distinct /24 prefixes, to reduce the effect of
> a potential bgp hijack.  It seemed a little bit like a tragedy of the
> commons situation.
>
> Is this seen as route table pollution, or a necessary evil in today's
> world?
> How many routers out there today would be affected if everyone did this?
> Are there any big networks that drop or penalize announcements like this?
>


Re: "Tactical" /24 announcements

2021-08-09 Thread Tom Beecher
Folks can announce masks longer than /24 all day. They're unlikely to
propagate very far though, since most networks won't accept anything longer
than /24 from the world at large.

To the OP, there are some valid reasons to strategically deaggregate here
and there, but a blanket "yolo my entire allocation into /24s" seems to be
a pretty ill considered request.

On Mon, Aug 9, 2021 at 1:34 PM Hank Nussbacher  wrote:

> On 09/08/2021 18:47, Billy Croan wrote:
> > How does the community feel about using /24 originations in BGP as a
> > tactical advantage against potential bgp hijackers?
> >
> > All of our allocations are larger and those prefixes we announce for
> > clients as well usually are.  But we had a request recently to
> > originate everything as distinct /24 prefixes, to reduce the effect of
> > a potential bgp hijack.  It seemed a little bit like a tragedy of the
> > commons situation.
> >
> > Is this seen as route table pollution, or a necessary evil in today's
> world?
> > How many routers out there today would be affected if everyone did this?
> > Are there any big networks that drop or penalize announcements like this?
> >
>
> In addition to what everyone else said, announcing /24s will not help
> you one bit since ASNs announce /25s, /26s, /27s, etc. Attached is a
> 7800+ line text file sorted by ASN with prefixes being announced that
> are more specific than /24 (only /25+/26+/27 listed).
>
> This is based on http://www.ris.ripe.net/dumps/riswhoisdump.IPv4.gz from
> about a month ago.
>
> That dump lists all the IPv4 prefixes seen in the collective of latest
> RIS table dumps, together with origin AS and number of peers that passed
> the routes to RIS.
>
> So good luck with announcing /24s.
>
> Regards,
> Hank
>


Re: "Tactical" /24 announcements

2021-08-09 Thread Hank Nussbacher

On 09/08/2021 18:47, Billy Croan wrote:

> How does the community feel about using /24 originations in BGP as a
> tactical advantage against potential bgp hijackers?
>
> All of our allocations are larger and those prefixes we announce for
> clients as well usually are.  But we had a request recently to
> originate everything as distinct /24 prefixes, to reduce the effect of
> a potential bgp hijack.  It seemed a little bit like a tragedy of the
> commons situation.
>
> Is this seen as route table pollution, or a necessary evil in today's
> world?
> How many routers out there today would be affected if everyone did this?
> Are there any big networks that drop or penalize announcements like this?



In addition to what everyone else said, announcing /24s will not help 
you one bit since ASNs announce /25s, /26s, /27s, etc. Attached is a 
7800+ line text file sorted by ASN with prefixes being announced that 
are more specific than /24 (only /25+/26+/27 listed).


This is based on http://www.ris.ripe.net/dumps/riswhoisdump.IPv4.gz from 
about a month ago.


That dump lists all the IPv4 prefixes seen in the collective of latest 
RIS table dumps, together with origin AS and number of peers that passed 
the routes to RIS.
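
A list like the attached one could be rebuilt roughly as follows. This is a sketch under assumptions: it takes each data line of the dump to be whitespace-separated as origin AS, prefix, number of peers, with comment lines starting with "%". Check the dump's own header before relying on that. The function name more_specifics is made up.

```python
from ipaddress import ip_network


def more_specifics(lines, shortest=25, longest=27):
    """Return (origin, prefix, peers) rows for /25-/27 prefixes, sorted by ASN."""
    found = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blanks and comment lines
        fields = line.split()
        if len(fields) < 3:
            continue
        origin, prefix, peers = fields[0], fields[1], fields[2]
        try:
            net = ip_network(prefix, strict=False)
            seen_by = int(peers)
        except ValueError:
            continue  # ignore lines that do not parse
        if shortest <= net.prefixlen <= longest:
            found.append((origin, str(net), seen_by))

    def asn_key(row):
        # Numeric origins sort numerically; anything else (e.g. AS sets) last.
        if row[0].isdigit():
            return (0, int(row[0]), row[1])
        return (1, 0, row[0] + row[1])

    return sorted(found, key=asn_key)


# Usage against the real file (not run here):
#   import gzip
#   with gzip.open("riswhoisdump.IPv4.gz", "rt") as fh:
#       rows = more_specifics(fh)
sample = [
    "% illustrative data only",
    "64500\t10.20.0.0/21\t300",
    "64501\t10.20.4.128/25\t12",
    "64499\t10.20.5.0/26\t3",
]
print(more_specifics(sample))
```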


So good luck with announcing /24s.

Regards,
Hank


more-specifics-sorted-by-asn.7z
Description: Binary data


Re: "Tactical" /24 announcements

2021-08-09 Thread Sabri Berisha
On Aug 9, 2021, at 9:22 AM, Masataka Ohta mo...@necom830.hpcl.titech.ac.jp wrote:

Hi,

> It should be 14M.

Just for fun, I did the math. A total of 16,777,216 /24s fit in 32 bits. Take 
away all the reserved space as per IANA (this is 1,266,696 /24s, see below), 
and we end up with 16,777,216 - 1,266,696 = 15,510,520 potential /24 
advertisements.

The largest FIB table I have seen (hi Jim!) was 3,563,546 routes in hardware. 
This was in a lab environment, of course.

Thanks,

Sabri


https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml
 
Subnet             Number of /24s

0.0.0.0/8          65536
10.0.0.0/8         65536
100.64.0.0/10      16384
127.0.0.0/8        65536
169.254.0.0/16     256
172.16.0.0/12      4096
192.0.0.0/24       1
192.0.2.0/24       1
192.31.196.0/24    1
192.52.193.0/24    1
192.88.99.0/24     1
192.168.0.0/16     256
192.175.48.0/24    1
198.18.0.0/15      512
198.51.100.0/24    1
203.0.113.0/24     1
240.0.0.0/4        1048576

Total reserved     1,266,696
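
The totals above can be re-derived mechanically. A small sketch, using the same list of blocks as the table; the names RESERVED and count_24s are illustrative.

```python
from ipaddress import ip_network

# The reserved blocks listed in the table above.
RESERVED = [
    "0.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10", "127.0.0.0/8",
    "169.254.0.0/16", "172.16.0.0/12", "192.0.0.0/24", "192.0.2.0/24",
    "192.31.196.0/24", "192.52.193.0/24", "192.88.99.0/24",
    "192.168.0.0/16", "192.175.48.0/24", "198.18.0.0/15",
    "198.51.100.0/24", "203.0.113.0/24", "240.0.0.0/4",
]


def count_24s(prefix):
    """Number of /24s inside a prefix that is /24 or shorter."""
    return 2 ** (24 - ip_network(prefix).prefixlen)


reserved_24s = sum(count_24s(p) for p in RESERVED)
total_24s = 2 ** 24
potential = total_24s - reserved_24s

print(reserved_24s)  # 1266696
print(potential)     # 15510520
```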


Re: "Tactical" /24 announcements

2021-08-09 Thread William Herrin
On Mon, Aug 9, 2021 at 9:24 AM Masataka Ohta
 wrote:
> William Herrin wrote:
> > I did some math on this years ago and it worked out to about 8.5
> > million IPv4 routes.
>
> It should be 14M.

Doubtful. Like I said, I did the math. The question I asked at the time was:

If:
IPv6 fails to overtake IPv4 and
IPv4 continues to be divided and redistributed to progressively
higher-value uses and
the /24 public Internet announcement boundary holds then

What will the terminal size of the IPv4 Internet BGP table be?


There are 2^24 = 16.8M /24s in the IPv4 address space.

Many of these are reserved for non-unicast uses, e.g. 224/3, 0/8
Many of the unicast addresses are reserved for non-public uses, e.g. 10/8, 127/8
Some portion of the assigned address space is used off-Internet in
valuable enough ways that its owners are unlikely to release it for
use on the Internet.
Some portion of the address space won't be disaggregated to /24
because their owners won't find it convenient
Some portion of the address space will have overlapping announcements
(/24s and the overlapping the /20, that sort of thing)

I no longer have the exact formula but I made some reasonable
assumptions for each of these factors and it worked out to 8.5 million
as the probable terminal size of the IPv4 table. There were error
bands too, I forget what they were, but nothing came even close to
14M.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-09 Thread Masataka Ohta

William Herrin wrote:

> I did some math on this years ago and it worked out to about 8.5
> million IPv4 routes.


It should be 14M.

Worse, it will be reached unless we stop doing multihoming by
routing, which is selfish.

Masataka Ohta


Re: "Tactical" /24 announcements

2021-08-09 Thread Adam Thompson
Yes, it is bad practice.  Yes, it's polluting the route table.
If the # of /24s involved is not ridiculously large (say, <64?) then I
would go ahead, as long as IRR and/or RPKI are also updated.
Obviously if everyone did it (i.e. advertising /24s exclusively) then our
FIBs would collectively balloon to a grotesquely unmanageable size, at
least on platforms that can't auto-aggregate that back down.  Thankfully,
not everyone is doing it.
I, too, would vastly prefer no one did this, but I have two customers that
demand it from time to time... and we've even done it for our own
allocation sometimes, and there's no robust, never mind bullet-proof,
technical argument why I can't do that for them (or for ourselves).  OTOH
robust arguments exist for why it's a good thing to do - sometimes, and
temporarily.
¯\_(ツ)_/¯
-Adam


Adam Thompson
Consultant, Infrastructure Services
100 - 135 Innovation Drive
Winnipeg, MB, R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
athomp...@merlin.mb.ca
www.merlin.mb.ca

From: NANOG on behalf of Billy Croan
Sent: August 9, 2021 10:47
To: nanog list
Subject: "Tactical" /24 announcements

How does the community feel about using /24 originations in BGP as a
tactical advantage against potential bgp hijackers?

All of our allocations are larger and those prefixes we announce for
clients as well usually are.  But we had a request recently to
originate everything as distinct /24 prefixes, to reduce the effect of
a potential bgp hijack.  It seemed a little bit like a tragedy of the
commons situation.

Is this seen as route table pollution, or a necessary evil in today's world?
How many routers out there today would be affected if everyone did this?
Are there any big networks that drop or penalize announcements like this?


Re: "Tactical" /24 announcements

2021-08-09 Thread Saku Ytti
On Mon, 9 Aug 2021 at 19:07, Martijn Schmidt via NANOG  wrote:

> It's route table pollution if you ask me.. in today's world we have many
> IXPs and several tier-1 operators that support RPKI ROV, so when you
> have issued ROAs for the supernet of the IP space in question it'll
> already significantly reduce the effects of a BGP hijack.

Not just a route table.

- RIB scale
- FIB scale
- Configuration scale

We just recently learned of an IOS-XR prefix-set limit of 31 when a
particular customer AS-SET expanded to a higher number of prefixes.

The problem with this scaling is that it doesn't reflect an increase
of revenue but it reflects an increase of cost. CAPEX to upgrade
devices without winning new business and OPEX to manage those
upgrades.

So it is a somewhat selfish solution to a problem.

-- 
  ++ytti


Re: "Tactical" /24 announcements

2021-08-09 Thread William Herrin
On Mon, Aug 9, 2021 at 8:48 AM Billy Croan  wrote:
> How does the community feel about using /24 originations in BGP as a
> tactical advantage against potential bgp hijackers?
> How many routers out there today would be affected if everyone did this?

Hi Billy,

I did some math on this years ago and it worked out to about 8.5
million IPv4 routes. That's 10 times the current table size, more than
any big-iron router can handle today. If everybody did it, it'd crash
the Internet.

> Is this seen as route table pollution, or a necessary evil in today's world?

Pollution. And it won't save you from a hijack either, since your
adversary's /24 routes will compete and win for at least part of the
Internet.

> Are there any big networks that drop or penalize announcements like this?

Not in an automated way. Which is bad news for you if you do this
because it means getting folks to -undo- the restrictions they
manually enforce on your specific address space is nearly impossible.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: "Tactical" /24 announcements

2021-08-09 Thread Martijn Schmidt via NANOG
It's route table pollution if you ask me.. in today's world we have many 
IXPs and several tier-1 operators that support RPKI ROV, so when you 
have issued ROAs for the supernet of the IP space in question it'll 
already significantly reduce the effects of a BGP hijack.

Best regards,
Martijn

On 8/9/21 5:47 PM, Billy Croan wrote:
> How does the community feel about using /24 originations in BGP as a
> tactical advantage against potential bgp hijackers?
>
> All of our allocations are larger and those prefixes we announce for
> clients as well usually are.  But we had a request recently to
> originate everything as distinct /24 prefixes, to reduce the effect of
> a potential bgp hijack.  It seemed a little bit like a tragedy of the
> commons situation.
>
> Is this seen as route table pollution, or a necessary evil in today's world?
> How many routers out there today would be affected if everyone did this?
> Are there any big networks that drop or penalize announcements like this?



"Tactical" /24 announcements

2021-08-09 Thread Billy Croan
How does the community feel about using /24 originations in BGP as a
tactical advantage against potential bgp hijackers?

All of our allocations are larger and those prefixes we announce for
clients as well usually are.  But we had a request recently to
originate everything as distinct /24 prefixes, to reduce the effect of
a potential bgp hijack.  It seemed a little bit like a tragedy of the
commons situation.

Is this seen as route table pollution, or a necessary evil in today's world?
How many routers out there today would be affected if everyone did this?
Are there any big networks that drop or penalize announcements like this?