Re: On consistency and 192.0.0.0/24

2024-05-14 Thread Jakob Heitz (jheitz) via NANOG
RFC 5736 was obsoleted by RFC 6890.
It says in part:

2.2.1.  Information Requirements

   The IPv4 and IPv6 Special-Purpose Address Registries maintain the
   following information regarding each entry:
…
   o  Forwardable - A boolean value indicating whether a router may
      forward an IP datagram whose destination address is drawn from the
      allocated special-purpose address block between external
      interfaces.
…

That means that some IP addresses in the block 192.0.0.0/24 may be routable.
So, I would not make this a bogon.

A better way to filter IP routes is by policy, for example based upon
IRR and RPKI records.

Kind Regards,
Jakob

-- Original message --
Date: Tue, 14 May 2024 12:00:15 +0200 (CEST)
From: b...@uu3.net


[10] 192.0.0.0/24 reserved for IANA IPv4 Special Purpose Address Registry
[RFC5736]. Complete registration details for 192.0.0.0/24 are found in
[IANA registry iana-ipv4-special-registry].

Was RFC5736 obsoleted? I think not, so I would treat it as a bogon.

It's a nice tiny subnet for special purposes. I personally use it
as my internal VM net on my desktop, for example.


-- Original message --

From: John Kristoff 
To: NANOG 
Subject: On consistency and 192.0.0.0/24
Date: Mon, 13 May 2024 16:18:47 -0500

As one to never let a good academic question go unasked... what is it
about 192.0.0.0/24 that is or isn't a bogon. This doesn't seem so
straightforward an answer to me, at least in theory.  Although in
practice it may already be decided whether one likes the answer or not.

192.0.0.0/24 was originally assigned to IANA for "protocol assignments"
in IETF RFC 5736, and later added to the list of reserved / special use
addresses in IETF RFC 6890 (aka BCP 153).   There is a corresponding
IPv6 block (2001::/23), but it has a significantly different history.

Team Cymru's bogon list includes the v4 prefix.  NLNOG's bogon
filtering guide does not.  When I asked Job about NLNOG's position he
said:

  "I was unsure what this prefix's future plans would be and erred on
  the side of caution and didn't include this prefix in the NLNOG bogon
  list recommendations."

The /24 as specified is not for "global" use, but some of the more
specific assignments are or can be.  See:
.

From my cursory examination I can't find cases where the v4 prefix or
more specifics have been publicly announced to any significant degree.
This however is not the case for the IPv6 prefix (e.g., the AS112
project, Teredo).

Maybe you'd say the /24 should be filtered, but not the more specifics
that are deemed available for global use.  That might be reasonable,
except many reasonable people will filter small prefixes.
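
To make the interaction between a bogon entry for the aggregate and a
maximum-prefix-length rule concrete, here is a minimal sketch of a
hypothetical import policy. The function name `accept` and the policy itself
are made up for illustration. The two /32s are assumed to be entries that the
IANA registry marks as globally reachable (PCP anycast and TURN anycast);
check the registry before relying on them.

```python
import ipaddress

# The aggregate under discussion.
IANA_SPECIAL = ipaddress.ip_network("192.0.0.0/24")

# Assumption: more-specifics listed as globally reachable in the IANA
# special-purpose registry. Verify against the registry before use.
GLOBAL_MORE_SPECIFICS = [
    ipaddress.ip_network("192.0.0.9/32"),   # assumed: PCP anycast
    ipaddress.ip_network("192.0.0.10/32"),  # assumed: TURN anycast
]


def accept(prefix: str, max_len: int = 24) -> bool:
    """Return True if this hypothetical import policy accepts the route."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen > max_len:
        return False  # dropped for length, whatever the registry says
    if net.version == 4 and net.subnet_of(IANA_SPECIAL):
        # Inside the aggregate: only the exempted more-specifics pass.
        return net in GLOBAL_MORE_SPECIFICS
    return True
```

With the default /24 limit, the /32s are dropped for length before the bogon
check is even reached, so exempting them only matters for networks that
accept prefixes that long.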

IANA's language may have put any "do not filter" camp in a relatively
weak position:

  "Address prefixes listed in the Special-Purpose Address Registry are
  not guaranteed routability in any particular local or global context."

I can't remember hearing anyone complaining about bogon-related
reachability problems with the aggregate IANA prefixes generally.  Is
there a strong case to make that ops should not bogon filter any
addresses in these prefixes?  At least with IPv4?  What about for IPv6?

John



Re: NANOG Digest, Vol 193, Issue 1

2024-02-01 Thread Jakob Heitz (jheitz) via NANOG
Wow!
The reason it’s called generative AI is because it totally made that up.

Kind Regards,
Jakob


Date: Wed, 31 Jan 2024 18:27:24 +
From: "Compton, Rich" 
To: Mohammad Khalil , NANOG list 
Subject: Re: SOVC - BGp RPKI

ChatGPT says:
SOVC in the context of RPKI (Resource Public Key Infrastructure) on a Cisco 
router stands for "Stale Origin Validation Cache". RPKI is a security framework 
designed to secure the Internet's routing infrastructure, primarily through 
route origin validation. It ensures that the Internet number resources (like IP 
addresses and AS numbers) are used by the legitimate owners or authorized AS 
(Autonomous System).
In RPKI, Route Origin Authorizations (ROAs) are used to define which AS is 
authorized to announce a specific IP address block. Network devices, like Cisco 
routers, use these ROAs to validate the authenticity of BGP (Border Gateway 
Protocol) route announcements.
The term "stale" in SOVC refers to a situation where the router's 
RPKI-to-Router protocol client has lost its connection to the RPKI server, or 
when the RPKI cache data is outdated and not refreshed for some reason. This 
can happen due to network issues, configuration errors, or problems with the 
RPKI server itself. When the RPKI cache is stale, the router cannot reliably 
validate BGP route announcements against the latest ROA data, potentially 
affecting routing decisions.
In a network security context, maintaining an up-to-date RPKI cache is crucial 
for ensuring that the network only accepts legitimate routing announcements, 
thereby reducing the risk of routing hijacks or misconfigurations. As a network 
security engineer, managing and monitoring the RPKI status on routers is an 
important aspect of ensuring network security and integrity.





Re: SOVC - BGp RPKI

2024-02-01 Thread Jakob Heitz (jheitz) via NANOG
In bgp_sovc.h, at the top, it says:
BGP Secure Origin Validation Code
Further down in the file, it says:
BGP Secured Origin Validate Cache – SOVC

Basically, the router downloads the VRPs from the RPKI server, using RFC 6810.
Then it uses the downloaded VRPs to validate received routes using RFC 6811.
SOVC refers to the code that does that.
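
As a rough illustration of the second step, here is a minimal sketch of the
RFC 6811 origin validation logic. It is not the router's code; the `VRP`
tuple, the `validate` function and the sample data (documentation prefixes
and private-use AS numbers) are assumptions made for the example.

```python
import ipaddress
from typing import NamedTuple


class VRP(NamedTuple):
    """Validated ROA Payload as delivered by the RPKI-to-Router protocol."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix: str, origin_asn: int, vrps: list[VRP]) -> str:
    """Return "Valid", "Invalid" or "NotFound" for a received route."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # A VRP covers the route if the route lies inside the VRP prefix.
        if vrp_net.version != route.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        # A covering VRP matches if the origin AS is the same (AS 0 never
        # matches) and the route is no longer than maxLength.
        if vrp.asn != 0 and vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "Valid"
    return "Invalid" if covered else "NotFound"


# Hypothetical VRP set for illustration only.
vrps = [VRP("203.0.113.0/24", 24, 64500)]
```

Here `validate("203.0.113.0/24", 64500, vrps)` is Valid, the same prefix from
AS 64501 or a /25 out of it is Invalid, and a prefix with no covering VRP is
NotFound.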

Kind Regards,
Jakob


Date: Wed, 31 Jan 2024 16:16:15 +0300
From: Mohammad Khalil 
To: NANOG list 

Greetings
I have tried to find out what SOVC stands for, with no luck.

#sh bgp ipv4 unicast rpki servers
BGP SOVC neighbor is X.X.X.47/323 connected to port 323

Has anyone encountered this?

Thanks!



Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-02 Thread Jakob Heitz (jheitz) via NANOG
On a related note, I'm working on a project to handle FIB overflow in
such a way as to cause the least disruption in the network.

I welcome suggestions either on or off list.

Kind Regards,
Jakob



Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-01 Thread Jakob Heitz (jheitz) via NANOG
While I did allude to some of the complexity, my main point is that FIB
compression does not let you get by with less FIB memory, because you must be
prepared for transients during which the FIB has to store the routes mostly
uncompressed anyway.
All it does is increase convergence time.

Kind Regards,
Jakob


From: William Herrin 
Date: Sunday, October 1, 2023 at 6:32 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org 
Subject: Re: maximum ipv4 bgp prefix length of /24 ?
On Sun, Oct 1, 2023 at 5:40 PM Jakob Heitz (jheitz) via NANOG
 wrote:
> Among the issues:
> Suppose the FIB has all the /24 components to make a /20, so it programs a 
> /20.
> Then one of the /24's changes nexthop. It now has to undo all that compression

Yeah... all this stuff is on the same level of complexity as
implementing a B-Tree. Standard task on the road to an undergraduate
computer science degree. Compared to decoding a BGP update message,
where nearly everything is variable length and you have to nibble away
at the current field to find the start of the next field, this is a
cakewalk.

It doesn't actually get complicated until you want to do more than
just joining adjacent address blocks.

Regards,
Bill Herrin




--
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: maximum ipv4 bgp prefix length of /24 ?

2023-10-01 Thread Jakob Heitz (jheitz) via NANOG
Among the issues:
Suppose the FIB has all the /24 components to make a /20, so it programs a /20.
Then one of the /24s changes nexthop. It now has to undo all that compression
by reinstalling some of the routes and figuring out the minimum set of
/21, /22, /23 and /24 to make it happen. Then, to avoid a transient, it needs
to make before break.
Quite a bit of FIB programming needs to happen just to modify a single /24.
Then the next /24 in the set also changes its nexthop, and so on for 10 routes.
All because a peer link flapped.
That affects convergence.
Then you need to buy a line card that can hold all the individual routes,
because you can't always compress: not all the routes in your compressed set
have the same nexthop during a transient.
Finally, it's all nicely compressed.
Now what? You have lots of empty slots in your FIB.
I'm sure lots of nerds can come up with transient reduction algorithms, but I'd
rather not.
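
The /20 example can be reproduced with a small sketch. This is a naive model,
not how any particular line card works: the `compress` function is made up
for illustration, it only merges adjacent prefixes that share a nexthop, and
it assumes the input prefixes do not overlap.

```python
import ipaddress
from collections import defaultdict


def compress(fib: dict[str, str]) -> dict[str, str]:
    """Merge adjacent prefixes that share a nexthop (naive FIB compression)."""
    by_nexthop = defaultdict(list)
    for prefix, nexthop in fib.items():
        by_nexthop[nexthop].append(ipaddress.ip_network(prefix))
    compressed = {}
    for nexthop, nets in by_nexthop.items():
        # collapse_addresses() joins sibling networks into their supernet.
        for net in ipaddress.collapse_addresses(nets):
            compressed[str(net)] = nexthop
    return compressed


# Sixteen /24s that together make up 10.0.0.0/20, all via nexthop "A".
rib = {f"10.0.{i}.0/24": "A" for i in range(16)}
before = compress(rib)

# One /24 moves to nexthop "B", for example after a peer link flap.
rib["10.0.5.0/24"] = "B"
after = compress(rib)
```

`before` holds a single /20. `after` holds five entries: a /22, a /24, a /23
and a /21 via "A", plus the changed /24 via "B". One nexthop change on one /24
turns into one delete and five installs in the FIB.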

Kind Regards,
Jakob

---
Date: Sat, 30 Sep 2023 20:04:29 -0700
From: Owen DeLong 

Not sure why you think FIB compression is a risk or will be a mess. It's a
pretty straightforward task.

Owen



Re: maximum ipv4 bgp prefix length of /24 ?

2023-09-29 Thread Jakob Heitz (jheitz) via NANOG
Each unit of mask length increase doubles the size of the table theoretically.
About 60% of the table is /24 routes.
Just going to /25 will probably double the table size.
Not sure I'd like to extrapolate the estimate out to /27.
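
The theoretical doubling is plain arithmetic: there are 2^n possible IPv4
prefixes of mask length n. A small sketch, with a function name made up for
illustration; the numbers are upper bounds over the whole address space, not
measured table sizes.

```python
def possible_prefixes(length: int) -> int:
    """Number of distinct IPv4 prefixes of exactly this mask length."""
    return 2 ** length


for length in (24, 25, 26, 27):
    print(f"/{length}: {possible_prefixes(length):>11,} possible prefixes")
```

Each extra bit of mask length doubles the count, so /27 has eight times as
many possible prefixes as /24.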

Kind Regards,
Jakob

-
Date: Fri, 29 Sep 2023 00:25:55 +0300
From: VOLKAN SAL?H 
To: nanog@nanog.org

hello,

I believe ISPs should also allow IPv4 prefixes with lengths between
/25 and /27 instead of limiting the maximum length to /24.

I also believe that RIRs and LIRs should allocate /27s, which have 32 IPv4
addresses. Considering that the IPv4 world is now mostly NATed, 32 IPv4
addresses are sufficient for most small and medium-sized organizations, and
also for home office workers like YouTubers, professional gamers and
webmasters.

The reason is that BGP research and experiment networks cannot get a /24
because of high IPv4 prices, yet they need an IPv4 prefix to learn BGP in
the IPv4 world.

What do you think about this?

What could be done here?

Is it unacceptable? Most big networks that do full-table routing also use
multi-core routers with lots of RAM, and those would probably handle /27s,
while small networks mostly use default routing. So shouldn't it be
reasonable to allow /25 to /27?

Thanks for reading, regards..



Re: JunOS/FRR/Nokia et al BGP critical issue

2023-08-30 Thread Jakob Heitz (jheitz) via NANOG
You may treat-as-withdraw instead of discard.
However, this attribute does not affect routing.
It only affects whether a sender of packets to the route will add the entropy
label or not to the MPLS header, if such an MPLS header is added.
Therefore, it is safe to discard the attribute.

Kind Regards,
Jakob


From: Jakob Heitz (jheitz) 
Date: Wednesday, August 30, 2023 at 8:15 AM
To: nanog@nanog.org 
Subject: Re: JunOS/FRR/Nokia et al BGP critical issue
IOS-XR passes on the attribute by default.
Some other routers incorrectly claim it to be malformed and reset the BGP 
session.
IOS-XR has a configuration to discard an attribute, so it will not pass it on.
It will pass the route with all its other attributes.
Here is an example configuration:

router bgp {asn}
 attribute-filter group block_elc
  attribute 28 discard
 !
 neighbor {ip address}
  update in filtering
   attribute-filter group block_elc
  !
 !
!

More info:
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/routing/command/reference/b-routing-cr-asr9000/bgp-commands.html#wp3145726977
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-8/routing/configuration/guide/b-routing-cg-asr9000-78x/implementing-bgp.html#concept_77EE033C2F0C4BDDB8423C25FA71E3F9


Kind Regards,
Jakob


From: Jakob Heitz (jheitz) 
Date: Wednesday, August 30, 2023 at 7:43 AM
To: nanog@nanog.org 
Subject: Re: JunOS/FRR/Nokia et al BGP critical issue
The blog was updated. Correct link:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
The attribute was not malformed.
This is the hex dump of the attribute: “E0 1C 00”
It is described here.
https://www.rfc-editor.org/rfc/rfc6790#section-5.2
This attribute is deprecated, but that does not prevent routers from 
originating it or passing it on.

Kind Regards,
Jakob

- Original message --
From: Mike Lyon 
To: NANOG list 

Ran across this article today and haven't seen posts about it so i
figured I would share:

https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling?fbclid=IwAR13ePY43Vf3u4X8PDyCDT39DtyXczAKkv6CGXOQbcQv90Y3aIAmTkJxn7k_aem_Ad0hzj2Mh_WlbFZug-vGdlJJdXr2Xo0RFIsPwAU2GviPz6xZDib76YHwFuzU7E0_sJk=Zxz2cZ

Curious if anyone on the list is running VyOS and has experienced any problems?

Cheers,
Mike

--
Mike Lyon
mike.l...@gmail.com
http://www.linkedin.com/in/mlyon





Re: JunOS/FRR/Nokia et al BGP critical issue

2023-08-30 Thread Jakob Heitz (jheitz) via NANOG
IOS-XR passes on the attribute by default.
Some other routers incorrectly claim it to be malformed and reset the BGP 
session.
IOS-XR has a configuration to discard an attribute, so it will not pass it on.
It will pass the route with all its other attributes.
Here is an example configuration:

router bgp {asn}
 attribute-filter group block_elc
  attribute 28 discard
 !
 neighbor {ip address}
  update in filtering
   attribute-filter group block_elc
  !
 !
!

More info:
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/routing/command/reference/b-routing-cr-asr9000/bgp-commands.html#wp3145726977
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-8/routing/configuration/guide/b-routing-cg-asr9000-78x/implementing-bgp.html#concept_77EE033C2F0C4BDDB8423C25FA71E3F9


Kind Regards,
Jakob


From: Jakob Heitz (jheitz) 
Date: Wednesday, August 30, 2023 at 7:43 AM
To: nanog@nanog.org 
Subject: Re: JunOS/FRR/Nokia et al BGP critical issue
The blog was updated. Correct link:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
The attribute was not malformed.
This is the hex dump of the attribute: “E0 1C 00”
It is described here.
https://www.rfc-editor.org/rfc/rfc6790#section-5.2
This attribute is deprecated, but that does not prevent routers from 
originating it or passing it on.

Kind Regards,
Jakob

- Original message --
From: Mike Lyon 
To: NANOG list 

Ran across this article today and haven't seen posts about it so i
figured I would share:

https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling?fbclid=IwAR13ePY43Vf3u4X8PDyCDT39DtyXczAKkv6CGXOQbcQv90Y3aIAmTkJxn7k_aem_Ad0hzj2Mh_WlbFZug-vGdlJJdXr2Xo0RFIsPwAU2GviPz6xZDib76YHwFuzU7E0_sJk=Zxz2cZ

Curious if anyone on the list is running VyOS and has experienced any problems?

Cheers,
Mike

--
Mike Lyon
mike.l...@gmail.com
http://www.linkedin.com/in/mlyon




Re: JunOS/FRR/Nokia et al BGP critical issue

2023-08-30 Thread Jakob Heitz (jheitz) via NANOG
The blog was updated. Correct link:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
The attribute was not malformed.
This is the hex dump of the attribute: “E0 1C 00”
It is described here.
https://www.rfc-editor.org/rfc/rfc6790#section-5.2
This attribute is deprecated, but that does not prevent routers from 
originating it or passing it on.
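
For readers who want to check the hex dump by hand, here is a minimal sketch
that decodes a BGP path attribute header as laid out in RFC 4271 section 4.3
(flags octet, type code, then a one- or two-octet length). The function name
`parse_path_attribute` is made up for illustration.

```python
def parse_path_attribute(data: bytes) -> dict:
    """Decode the flags, type code, length and value of one path attribute."""
    flags, type_code = data[0], data[1]
    extended = bool(flags & 0x10)  # extended length: two length octets
    if extended:
        length = int.from_bytes(data[2:4], "big")
        value = data[4:4 + length]
    else:
        length = data[2]
        value = data[3:3 + length]
    return {
        "optional": bool(flags & 0x80),
        "transitive": bool(flags & 0x40),
        "partial": bool(flags & 0x20),
        "extended_length": extended,
        "type_code": type_code,
        "length": length,
        "value": value,
    }


attr = parse_path_attribute(bytes.fromhex("E0 1C 00"))
```

The result is an optional, transitive attribute with the partial bit set,
type code 28 and a zero-length value, which is well-formed as far as the
header goes.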

Kind Regards,
Jakob

- Original message --
From: Mike Lyon 
To: NANOG list 

Ran across this article today and haven't seen posts about it so i
figured I would share:

https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling?fbclid=IwAR13ePY43Vf3u4X8PDyCDT39DtyXczAKkv6CGXOQbcQv90Y3aIAmTkJxn7k_aem_Ad0hzj2Mh_WlbFZug-vGdlJJdXr2Xo0RFIsPwAU2GviPz6xZDib76YHwFuzU7E0_sJk=Zxz2cZ

Curious if anyone on the list is running VyOS and has experienced any problems?

Cheers,
Mike

--
Mike Lyon
mike.l...@gmail.com
http://www.linkedin.com/in/mlyon



Re: Destination Preference Attribute for BGP

2023-08-18 Thread Jakob Heitz (jheitz) via NANOG
Fact remains, operators scrub communities and path-attributes for many reasons.
That's why as-path length is used as a traffic engineering mechanism over 
multiple AS hops.
As limited as it is, it's what we have.
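
A toy model of why prepending works across several AS hops, and why it is
limited. The AS numbers, neighbor labels and both function names are made up
for illustration, and the sketch assumes the remote AS applies the same local
preference to both paths, so that AS-path length decides.

```python
def prepend(as_path: list[int], asn: int, times: int) -> list[int]:
    """Return the AS path with `asn` prepended `times` extra times."""
    return [asn] * times + as_path


def best_path(paths: dict[str, list[int]]) -> str:
    """Pick the neighbor with the shortest AS path; ties go to the first seen."""
    return min(paths, key=lambda neighbor: len(paths[neighbor]))


# Origin AS 64500 announces the same prefix through transits 64501 and 64502,
# prepending itself twice on the announcement sent to 64502.
paths_at_remote = {
    "via-64501": [64501] + prepend([64500], 64500, 0),
    "via-64502": [64502] + prepend([64500], 64500, 2),
}
chosen = best_path(paths_at_remote)
```

The remote AS picks the path through 64501 because it is two hops shorter.
If that AS sets a higher local preference on the other path, the prepends
have no effect, which is the limitation mentioned above.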

Kind Regards,
Jakob


From: Jakob Heitz (jheitz) 
Date: Friday, August 18, 2023 at 1:20 PM
To: Robert Raszuk 
Cc: nanog@nanog.org 
Subject: Re: Destination Preference Attribute for BGP
We support platforms of various capacities.
While we would all like to sell the large ones, people buy the cheap ones too.

Kind Regards,
Jakob


From: Robert Raszuk 
Date: Friday, August 18, 2023 at 12:55 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org 
Subject: Re: Destination Preference Attribute for BGP
Jakob,

Considering how much junk is being added to the BGP protocol these days,
communities are the least of your worries as far as RAM and protocol
convergence time are concerned. Then you have those new concepts of
limited/trusted domains, where a blast radius of much higher caliber than
communities would ever reach extends across ASNs.

It is interesting that not many folks from this list participate in the IETF
IDR WG and voice concerns about new BGP extensions, the vast majority of which
have nothing to do with interdomain IPv4 or IPv6 routing.

While it is great that you keep fixing bugs, I would encourage your platform/RP
designers to take a look at Amazon memory and CPU prices and make RPs a bit
more powerful than average smartphones.

Cheers,
R.

On Fri, Aug 18, 2023 at 8:05 PM Jakob Heitz (jheitz) <jhe...@cisco.com> wrote:
Perhaps to you Robert.
I work on code and with customer issues that escalate to code.

Kind Regards,
Jakob


From: Robert Raszuk <rob...@raszuk.net>
Date: Friday, August 18, 2023 at 10:59 AM
To: Jakob Heitz (jheitz) <jhe...@cisco.com>
Cc: nanog@nanog.org
Subject: Re: Destination Preference Attribute for BGP
Hi Jakob,

On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
That's true Robert.
However, communities and med only work with neighbors.
Communities routinely get scrubbed because they cause increased memory usage 
and convergence time in routers.

Considering that we are talking about control plane memory, I think the
cost/space associated with storing communities is less than negligible these
days.

And honestly, with the number of BGP update generation optimizations, I would
not say that they contribute to longer protocol convergence in any measurable
way.

To me, communities get dropped on the EBGP peerings more for lack-of-trust and
policy reasons.

Cheers,
R.






Even new path attributes get scrubbed, because there have been bugs related to 
new ones in the past.
Here is a config snippet in XR

router bgp 23456
 attribute-filter group testAF
  attribute unrecognized discard
 !
 neighbor-group testNG
  update in filtering
   attribute-filter group testAF

The only thing that has any chance to go multiple ASes is as-path.
Need to be careful with that too because long ones get dropped.

route-policy testRP
  if as-path length ge 200 then
    drop
  endif
end-policy

Kind Regards,
Jakob


From: Robert Raszuk <rob...@raszuk.net>
Date: Friday, August 18, 2023 at 12:38 AM
To: Jakob Heitz (jheitz) <jhe...@cisco.com>
Cc: nanog@nanog.org
Subject: Re: Destination Preference Attribute for BGP
Jakob,

With AS-PATH prepend you have no control on the choice of which ASN should do 
what action on your advertisements.

However, the practice of publishing communities by (some) ASNs along with their 
remote actions could be treated as an alternative to the DPA attribute. It 
could result in remote PREPEND action too.

If only those communities would not be deleted by some transit networks 

Thx,
R.

On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
"prepend as-path" has taken its place.

Kind Regards,
Jakob


Date: Wed, 16 Aug 2023 21:42:22 +0200
From: Mark Tinka 

On 8/16/23 16:16, michael brooks - ESC wrote:

> Perhaps (probably) naively, it seems to me that DPA would have been a
> useful BGP attribute. Can anyone shed light on why this RFC never
> moved beyond draft status? I cannot find much information on this
> other than IETF's data tracker
> (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> (which implies DPA was in use, but then was deprecated).

I've never heard of this draft until now, but reading it, I can see why
it would likely not be adopted today (not sure what the consensus would
have been back in the '90's).

DPA looks like MED on drugs.

Not sure operators want remote downstream ISP's arbitrarily choosing
which of their peering interconnects (and backbone links) carry traffic
from sour

Re: Destination Preference Attribute for BGP

2023-08-18 Thread Jakob Heitz (jheitz) via NANOG
We support platforms of various capacities.
While we would all like to sell the large ones, people buy the cheap ones too.

Kind Regards,
Jakob


From: Robert Raszuk 
Date: Friday, August 18, 2023 at 12:55 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org 
Subject: Re: Destination Preference Attribute for BGP
Jakob,

Considering how much junk is being added to the BGP protocol these days,
communities are the least of your worries as far as RAM and protocol
convergence time are concerned. Then you have those new concepts of
limited/trusted domains, where a blast radius of much higher caliber than
communities would ever reach extends across ASNs.

It is interesting that not many folks from this list participate in the IETF
IDR WG and voice concerns about new BGP extensions, the vast majority of which
have nothing to do with interdomain IPv4 or IPv6 routing.

While it is great that you keep fixing bugs, I would encourage your platform/RP
designers to take a look at Amazon memory and CPU prices and make RPs a bit
more powerful than average smartphones.

Cheers,
R.

On Fri, Aug 18, 2023 at 8:05 PM Jakob Heitz (jheitz) <jhe...@cisco.com> wrote:
Perhaps to you Robert.
I work on code and with customer issues that escalate to code.

Kind Regards,
Jakob


From: Robert Raszuk <rob...@raszuk.net>
Date: Friday, August 18, 2023 at 10:59 AM
To: Jakob Heitz (jheitz) <jhe...@cisco.com>
Cc: nanog@nanog.org
Subject: Re: Destination Preference Attribute for BGP
Hi Jakob,

On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
That's true Robert.
However, communities and med only work with neighbors.
Communities routinely get scrubbed because they cause increased memory usage 
and convergence time in routers.

Considering that we are talking about control plane memory, I think the
cost/space associated with storing communities is less than negligible these
days.

And honestly, with the number of BGP update generation optimizations, I would
not say that they contribute to longer protocol convergence in any measurable
way.

To me, communities get dropped on the EBGP peerings more for lack-of-trust and
policy reasons.

Cheers,
R.






Even new path attributes get scrubbed, because there have been bugs related to 
new ones in the past.
Here is a config snippet in XR

router bgp 23456
 attribute-filter group testAF
  attribute unrecognized discard
 !
 neighbor-group testNG
  update in filtering
   attribute-filter group testAF

The only thing that has any chance to go multiple ASes is as-path.
Need to be careful with that too because long ones get dropped.

route-policy testRP
  if as-path length ge 200 then
    drop
  endif
end-policy

Kind Regards,
Jakob


From: Robert Raszuk <rob...@raszuk.net>
Date: Friday, August 18, 2023 at 12:38 AM
To: Jakob Heitz (jheitz) <jhe...@cisco.com>
Cc: nanog@nanog.org
Subject: Re: Destination Preference Attribute for BGP
Jakob,

With AS-PATH prepend you have no control on the choice of which ASN should do 
what action on your advertisements.

However, the practice of publishing communities by (some) ASNs along with their 
remote actions could be treated as an alternative to the DPA attribute. It 
could result in remote PREPEND action too.

If only those communities would not be deleted by some transit networks 

Thx,
R.

On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
"prepend as-path" has taken its place.

Kind Regards,
Jakob


Date: Wed, 16 Aug 2023 21:42:22 +0200
From: Mark Tinka 

On 8/16/23 16:16, michael brooks - ESC wrote:

> Perhaps (probably) naively, it seems to me that DPA would have been a
> useful BGP attribute. Can anyone shed light on why this RFC never
> moved beyond draft status? I cannot find much information on this
> other than IETF's data tracker
> (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> (which implies DPA was in use, but then was deprecated).

I've never heard of this draft until now, but reading it, I can see why
it would likely not be adopted today (not sure what the consensus would
have been back in the '90's).

DPA looks like MED on drugs.

Not sure operators want remote downstream ISP's arbitrarily choosing
which of their peering interconnects (and backbone links) carry traffic
from source to them. BGP is a poor communicator of bandwidth and
shilling cost, in general. Those kinds of decisions tend to be locally
made, and permitting outside influence could be a rather hard sell.

It reminds me of how router vendors implemented GMPLS in the hopes that
optical operators would allow their customers to build and control
circuits in the optical domain in some fantastic fashion.

Or

Re: Destination Preference Attribute for BGP

2023-08-18 Thread Jakob Heitz (jheitz) via NANOG
Perhaps to you Robert.
I work on code and with customer issues that escalate to code.

Kind Regards,
Jakob


From: Robert Raszuk 
Date: Friday, August 18, 2023 at 10:59 AM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org 
Subject: Re: Destination Preference Attribute for BGP
Hi Jakob,

On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
That's true Robert.
However, communities and med only work with neighbors.
Communities routinely get scrubbed because they cause increased memory usage 
and convergence time in routers.

Considering that we are talking about control plane memory, I think the
cost/space associated with storing communities is less than negligible these
days.

And honestly, with the number of BGP update generation optimizations, I would
not say that they contribute to longer protocol convergence in any measurable
way.

To me, communities get dropped on the EBGP peerings more for lack-of-trust and
policy reasons.

Cheers,
R.






Even new path attributes get scrubbed, because there have been bugs related to 
new ones in the past.
Here is a config snippet in XR

router bgp 23456
 attribute-filter group testAF
  attribute unrecognized discard
 !
 neighbor-group testNG
  update in filtering
   attribute-filter group testAF

The only thing that has any chance to go multiple ASes is as-path.
Need to be careful with that too because long ones get dropped.

route-policy testRP
  if as-path length ge 200 then
    drop
  endif
end-policy

Kind Regards,
Jakob


From: Robert Raszuk <rob...@raszuk.net>
Date: Friday, August 18, 2023 at 12:38 AM
To: Jakob Heitz (jheitz) <jhe...@cisco.com>
Cc: nanog@nanog.org
Subject: Re: Destination Preference Attribute for BGP
Jakob,

With AS-PATH prepend you have no control on the choice of which ASN should do 
what action on your advertisements.

However, the practice of publishing communities by (some) ASNs along with their 
remote actions could be treated as an alternative to the DPA attribute. It 
could result in remote PREPEND action too.

If only those communities would not be deleted by some transit networks 

Thx,
R.

On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
"prepend as-path" has taken its place.

Kind Regards,
Jakob


Date: Wed, 16 Aug 2023 21:42:22 +0200
From: Mark Tinka 

On 8/16/23 16:16, michael brooks - ESC wrote:

> Perhaps (probably) naively, it seems to me that DPA would have been a
> useful BGP attribute. Can anyone shed light on why this RFC never
> moved beyond draft status? I cannot find much information on this
> other than IETF's data tracker
> (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> (which implies DPA was in use, but then was deprecated).

I've never heard of this draft until now, but reading it, I can see why
it would likely not be adopted today (not sure what the consensus would
have been back in the '90's).

DPA looks like MED on drugs.

Not sure operators want remote downstream ISP's arbitrarily choosing
which of their peering interconnects (and backbone links) carry traffic
from source to them. BGP is a poor communicator of bandwidth and
shilling cost, in general. Those kinds of decisions tend to be locally
made, and permitting outside influence could be a rather hard sell.

It reminds me of how router vendors implemented GMPLS in the hopes that
optical operators would allow their customers to build and control
circuits in the optical domain in some fantastic fashion.

Or how router vendors built Sync-E and PTP into their routers hoping
that they could sell timing as a service to mobile network operators as
part of a RAN backhaul service.

Some things just tend to be sacred.

Mark.


Re: Destination Preference Attribute for BGP

2023-08-18 Thread Jakob Heitz (jheitz) via NANOG
That's true Robert.
However, communities and med only work with neighbors.
Communities routinely get scrubbed because they cause increased memory usage 
and convergence time in routers.
Even new path attributes get scrubbed, because there have been bugs related to 
new ones in the past.
Here is a config snippet in XR

router bgp 23456
 attribute-filter group testAF
  attribute unrecognized discard
 !
 neighbor-group testNG
  update in filtering
   attribute-filter group testAF

The only thing that has any chance to go multiple ASes is as-path.
Need to be careful with that too because long ones get dropped.

route-policy testRP
  if as-path length ge 200 then
    drop
  endif
end-policy

Kind Regards,
Jakob


From: Robert Raszuk 
Date: Friday, August 18, 2023 at 12:38 AM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org 
Subject: Re: Destination Preference Attribute for BGP
Jakob,

With AS-PATH prepend you have no control on the choice of which ASN should do 
what action on your advertisements.

However, the practice of publishing communities by (some) ASNs along with their 
remote actions could be treated as an alternative to the DPA attribute. It 
could result in remote PREPEND action too.

If only those communities would not be deleted by some transit networks 

Thx,
R.

On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <nanog@nanog.org> wrote:
"prepend as-path" has taken its place.

Kind Regards,
Jakob


Date: Wed, 16 Aug 2023 21:42:22 +0200
From: Mark Tinka 

On 8/16/23 16:16, michael brooks - ESC wrote:

> Perhaps (probably) naively, it seems to me that DPA would have been a
> useful BGP attribute. Can anyone shed light on why this RFC never
> moved beyond draft status? I cannot find much information on this
> other than IETF's data tracker
> (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> (which implies DPA was in use, but then was deprecated).

I've never heard of this draft until now, but reading it, I can see why
it would likely not be adopted today (not sure what the consensus would
have been back in the '90's).

DPA looks like MED on drugs.

Not sure operators want remote downstream ISP's arbitrarily choosing
which of their peering interconnects (and backbone links) carry traffic
from source to them. BGP is a poor communicator of bandwidth and
shilling cost, in general. Those kinds of decisions tend to be locally
made, and permitting outside influence could be a rather hard sell.

It reminds me of how router vendors implemented GMPLS in the hopes that
optical operators would allow their customers to build and control
circuits in the optical domain in some fantastic fashion.

Or how router vendors built Sync-E and PTP into their routers hoping
that they could sell timing as a service to mobile network operators as
part of a RAN backhaul service.

Some things just tend to be sacred.

Mark.


Re: Destination Preference Attribute for BGP

2023-08-17 Thread Jakob Heitz (jheitz) via NANOG
"prepend as-path" has taken its place.

Kind Regards,
Jakob


Date: Wed, 16 Aug 2023 21:42:22 +0200
From: Mark Tinka 

On 8/16/23 16:16, michael brooks - ESC wrote:

> Perhaps (probably) naively, it seems to me that DPA would have been a
> useful BGP attribute. Can anyone shed light on why this RFC never
> moved beyond draft status? I cannot find much information on this
> other than IETF's data tracker
> (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> (which implies DPA was in use, but then was deprecated).

I've never heard of this draft until now, but reading it, I can see why
it would likely not be adopted today (not sure what the consensus would
have been back in the '90's).

DPA looks like MED on drugs.

Not sure operators want remote downstream ISP's arbitrarily choosing
which of their peering interconnects (and backbone links) carry traffic
from source to them. BGP is a poor communicator of bandwidth and
shilling cost, in general. Those kinds of decisions tend to be locally
made, and permitting outside influence could be a rather hard sell.

It reminds me of how router vendors implemented GMPLS in the hopes that
optical operators would allow their customers to build and control
circuits in the optical domain in some fantastic fashion.

Or how router vendors built Sync-E and PTP into their routers hoping
that they could sell timing as a service to mobile network operators as
part of a RAN backhaul service.

Some things just tend to be sacred.

Mark.



Re: Best Linux (or BSD) hosted BGP?

2023-05-03 Thread Jakob Heitz (jheitz) via NANOG
I just checked the Cisco IOS-XR code. It's not vulnerable to any of the 3 flaws 
listed in the below linked hackernews article.

Kind Regards,
Jakob


Date: Wed, 3 May 2023 12:52:46 +0300
From: Hank Nussbacher 

On 02/05/2023 17:56, Warren Kumari wrote:

For those that like FRR:
https://thehackernews.com/2023/05/researchers-uncover-new-bgp-flaws-in.html

Regards,
Hank



RE: Large prefix lists/sets on IOS-XR

2022-12-09 Thread Jakob Heitz (jheitz) via NANOG
Sander,

How big? How slow?
You can reply to me off or on list.

About 8 to 10 years ago, we had a large effort to improve this.
Now customers push many megabytes of prefix-sets several times a day and it 
works.
I have sent some questions internally to get a better answer.

Related, in 7.2.1, we added the as-set, which allows you to filter BGP routes 
by origin-as.
It is similar to as-path-set.
as-path-set is slow, using a linear lookup, but it is versatile, allowing 
ios-regex.
as-set can only use numbers and only for origin-as, but it is fast using a 
log(N) lookup.
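
A rough sketch of why the two lookups scale differently, written in Python
rather than router code. It is not the IOS-XR implementation; the ASNs come
from the documentation range and the function names are made up for
illustration.

```python
import bisect
import re

# Hypothetical origin ASNs (documentation range); not from any real config.
ORIGIN_ASNS = sorted([64496, 64497, 64500, 64510])

# as-path-set style: one regex per entry, tried one after another (linear).
PATH_REGEXES = [re.compile(rf"(^| ){asn}$") for asn in ORIGIN_ASNS]


def match_linear(as_path: str) -> bool:
    """Test the path string against each regex until one matches: O(N)."""
    return any(rx.search(as_path) for rx in PATH_REGEXES)


def match_sorted(as_path: str) -> bool:
    """Binary-search the origin ASN (last hop) in a sorted list: O(log N)."""
    origin = int(as_path.split()[-1])
    i = bisect.bisect_left(ORIGIN_ASNS, origin)
    return i < len(ORIGIN_ASNS) and ORIGIN_ASNS[i] == origin
```

Both functions agree on whether the origin is in the set. The regex form can
match anywhere in the path, which is the versatility mentioned above; the
sorted form only answers the origin question, but stays cheap as the set
grows.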

Regards,
Jakob.

-Original Message-
Date: Fri, 9 Dec 2022 00:02:52 +0100
From: Sander Steffann 

Hi,

What is the best/most efficient/most convenient way to push large prefix lists 
or sets to an XR router for BGP prefix filtering? Pushing thousands of lines 
through the CLI seems foolish, I tried using the load command but it seems 
horribly slow. What am I missing? :)

Cheers!
Sander

---
for every complex problem, there's a solution that is simple, neat, and wrong



RE: Understanding impact of RPKI and ROA on existing advertisements

2022-11-03 Thread Jakob Heitz (jheitz) via NANOG
There are a lot of ROAs out there that make it EASIER to hijack
a route rather than harder.

If you register an ROA for a route and also advertise that route
in BGP, then an attacker who prepends your ASN has to at least
compete with your route with an AS_PATH length and will lose
in most of the Internet (but not all of it).

However, if you don't advertise the route, then the attacker has nothing
to compete with and his prepended route will be accepted as RPKI valid
everywhere.

Remember max_length in a ROA. All routes covered by that max_length
will be considered valid by RPKI if the origin ASN matches.
If you don't advertise them all, then you are just making it
EASIER for an attacker to hijack them.

For example if you have an ROA for 10.1.0.0/16, max_length 17,
that includes the routes:
10.1.0.0/16
10.1.0.0/17
10.1.128.0/17

If you don't advertise all those routes in BGP, they are open
to being hijacked and considered RPKI valid.

OTOH, if you register the ROA as 10.1.0.0/16 max_length 16,
then anyone who tries to advertise 10.1.0.0/17 will have
their advertisement rejected as RPKI invalid.
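
The max_length effect can be checked with a small origin-validation sketch in
Python. It follows the usual valid / invalid / not-found rules as I understand
them; the ASN and the function name are hypothetical, and this is not any
validator's actual code.

```python
import ipaddress


def rov_state(prefix: str, origin_asn: int, roas) -> str:
    """Return 'valid', 'invalid' or 'not-found' for one route.

    roas is a list of (roa_prefix, max_length, asn) tuples.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA does not cover the route
        covered = True
        if asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "not-found"


AS_OWNER = 64496  # hypothetical ASN from the documentation range
loose = [("10.1.0.0/16", 17, AS_OWNER)]  # max_length 17
tight = [("10.1.0.0/16", 16, AS_OWNER)]  # max_length 16
```

An attacker who prepends AS_OWNER presents the same origin as the real owner,
so rov_state("10.1.0.0/17", AS_OWNER, loose) comes back "valid", while the
same route checked against tight comes back "invalid".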

I'm aware that people create ROAs for more specifics in case
they need to advertise them to break a hijack.
But then the hijacker could just advertise the longest prefix
allowed by the ROA. You can't break that with a yet more specific.
Unless the user of the route is not validating with RPKI.

It's a conundrum.

Regards,
Jakob.



Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-12 Thread Jakob Heitz (jheitz) via NANOG
Here is a reason you might want to keep that /24.

Suppose you are a small ISP and I am your customer.
I also have another larger provider.
That larger provider is also your provider.
I own a /21 and advertise it to my larger provider.
You get that /21 from my larger provider.
I advertise a /24 subset of the /21 to you.
If you ignore my /24, then traffic for it goes
to the larger provider and I pay him for the traffic, not you.
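
The effect is plain longest-prefix matching. A minimal Python sketch, with
made-up prefixes from the benchmarking range and hypothetical next-hop labels:

```python
import ipaddress


def next_hop(dst: str, rib: dict) -> str:
    """Longest-prefix match: pick the most specific route covering dst."""
    addr = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(p) for p in rib
               if addr in ipaddress.ip_network(p)]
    best = max(matches, key=lambda n: n.prefixlen)
    return rib[str(best)]


# Hypothetical prefixes from the 198.18.0.0/15 benchmarking range.
full_rib = {
    "198.18.0.0/21": "larger provider",
    "198.18.3.0/24": "customer link",
}
filtered_rib = {"198.18.0.0/21": "larger provider"}  # the /24 was filtered
```

With full_rib, traffic for 198.18.3.10 leaves on the customer link. With
filtered_rib the same traffic follows the /21 to the larger provider.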

Regards,
Jakob.



RE: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

2022-05-24 Thread Jakob Heitz (jheitz) via NANOG
This attack will work very well until the victim starts advertising
its prefix. The victim may not notice the fake advertisement because the fake
advertisement will not reach the victim AS due to AS-path loop checking.
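
A small Python sketch of why the victim never sees the forged route. The ASNs
are hypothetical (documentation range) and the function name is made up; it
only models the loop check, nothing else in best-path selection.

```python
def accepts(as_path, local_asn):
    """eBGP loop prevention: reject a route whose AS_PATH has our own ASN."""
    return local_asn not in as_path


VICTIM, ATTACKER, OBSERVER = 64496, 64511, 64500  # hypothetical ASNs

# The attacker prepends the victim's ASN so the origin looks legitimate.
# The origin is the rightmost entry in the path.
forged_path = [ATTACKER, VICTIM]
```

accepts(forged_path, OBSERVER) is True, so third parties install the route,
while accepts(forged_path, VICTIM) is False, so the victim's own routers
discard it and never show it.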

So potential victims must advertise all prefixes that they register in
RPKI or subscribe to an Internet monitoring service to detect the
fake advertisements.

And don't forget maxlen. You must advertise in BGP every prefix
covered by maxlen.
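
To see how many prefixes that is, here is a short Python sketch that lists
everything a ROA authorizes. The function name is hypothetical and the prefix
is only an example value.

```python
import ipaddress


def covered_prefixes(roa_prefix: str, max_length: int):
    """List every prefix a ROA authorizes, from its own length to max_length."""
    net = ipaddress.ip_network(roa_prefix)
    out = []
    for length in range(net.prefixlen, max_length + 1):
        out.extend(str(s) for s in net.subnets(new_prefix=length))
    return out
```

covered_prefixes("10.1.0.0/16", 17) returns the /16 and its two /17 halves.
With maxlen 24 the same /16 covers 511 prefixes, which shows how quickly a
generous maxlen grows the set you would have to advertise.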

Regards,
Jakob.

-Original Message-
From: Saku Ytti 

On Tue, 24 May 2022 at 11:23, Max Tulyev  wrote:

> To make a working hijack of the routed prefix (for sniffing traffic,
> DDoS or something similar), you have to announce a more specific
> prefix(es). It can be denied by RPKI.
>
> If your signed RPKI prefix is still unannounced - yes, somebody can
> hijack it by forging the origin ASN - that's quite easy.

This axiomatically assumes first come, first serve, which is obviously
not complete understanding of BGP best path algorithm.

-- 
  ++ytti



RE: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

2022-05-15 Thread Jakob Heitz (jheitz) via NANOG
Saku,

You have two questions. I'll address the second one first.

Beginning in IOS-XR 7.3.1, there is a new O(log n) scalable way to test for 
autonomous system numbers (ASN) in route-policy. ASNs can be grouped into an 
as-set as follows:

as-set foo
  64496,
  64497
end-set
!
route-policy bar
  if not as-path originates-from foo then
    drop
  endif
  pass
end-policy

The first question:
If you use several tests in your route-policy and put the validation-state
test last, then any route that gets dropped before the validation-state
test is reached will not be saved with 
"soft-reconfiguration inbound RPKI-tested-only".
For example:

route-policy bar
  if not as-path originates-from foo then
    drop
  endif
  if validation-state is invalid then
    drop
  endif
  pass
end-policy

Regards,
Jakob.

-Original Message-
From: Saku Ytti  
Sent: Saturday, May 14, 2022 12:09 AM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

On Sat, 14 May 2022 at 00:17, Jakob Heitz (jheitz)  wrote:

Hey Jakob,

> 'RPKI-tested-only' will store all routes that encounter a 'validation-state'
> test in the inbound route policy. In that case, when an RPKI server updates a
> VRP to the router, it can re-run the inbound policy from the stored route and
> not require a refresh request to be sent.

> 'RPKI-dropped-only' causes the dropped routes to be stored. This will prevent
> the unnecessary route-refreshes described above. It does not prevent all
> route-refreshes, but uses significantly less memory than 'RPKI-tested-only'

I'm sorry, but I am unable to reason what these answers mean in
context of question that was:


---
if validation-state is valid then
  pass
else
  drop


I am assuming

a) RPKI-tested-only would be the same as normal, and keep every single route
b) RPKI-dropped-only would not keep anything (but it also might keep
everything and be same as a))

That is, in this specific scenario, as far as I understand, there is
no effect on the optimisations.



Just to clarify why this type of policy may not be insane. IOS-XR has
a 300k prefix limit for prefix-set, this limit is regularly hit by low
quality as-set. By low quality I mean almost all as-set expand to
unnecessarily large prefix-set, because as-set tend to be 'add only',
there is no incentive to remove, so they just grow over time and do
not represent in a meaningful manner the set of prefixes neighbours
might advertise.
And if we abstract what we the operators are actually doing, no one is
doing prefix filtering, what everyone does is build AS-tree, by
starting recursion from some as-set. So this AS-tree is the source of
truth, no prefixes at all, prefixes are almost incidental. After we
have this AS-tree, we flatten it and for each element we ask for a
route object with that origin. And then send this list to routers.
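
The recursion and flattening can be sketched in a few lines of Python. The
as-set names, ASNs and route objects below are invented for illustration and
stand in for real IRR queries; this is not how any particular tool does it.

```python
def flatten(as_set_name, as_sets, seen=None):
    """Recursively expand an as-set into the flat set of ASNs it contains."""
    seen = set() if seen is None else seen
    asns = set()
    if as_set_name in seen:  # guard against circular references
        return asns
    seen.add(as_set_name)
    for member in as_sets.get(as_set_name, []):
        if isinstance(member, int):
            asns.add(member)
        else:
            asns |= flatten(member, as_sets, seen)
    return asns


# Hypothetical IRR data: names and ASNs are made up for illustration.
AS_SETS = {
    "AS-EXAMPLE": [64496, "AS-CUSTOMERS"],
    "AS-CUSTOMERS": [64497, 64498, "AS-EXAMPLE"],  # circular on purpose
}
ROUTE_OBJECTS = {64496: ["192.0.2.0/24"], 64497: ["198.51.100.0/24"], 64498: []}


def prefixes_for(as_set_name):
    """Collect the route objects of each flattened origin ASN."""
    return sorted(p for asn in flatten(as_set_name, AS_SETS)
                  for p in ROUTE_OBJECTS.get(asn, []))
```

The flat ASN set is the small object; the prefix list is derived from it and
is the part that balloons when as-sets are never pruned.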

Understanding what we actually do here, offers a mechanism for config
size reduction as well as a standardized way to programmatically
deploy those prefix-lists, by (ab)using RTR for this. We can fill all
gaps from IRR data that RPKI data leaves us, then send a complete set
of DFZ origins to routers, this allows us to accept only valid
prefixes. Further the as-graph we created and flattened we can
implement per-neighbour, which is trivial size compared to prefix-set
size.

Now compiling those AS path filters as regexps may not be so cheap,
but some NOS offer cheap way to implement such AS filtering at scale:
https://www.juniper.net/documentation/us/en/software/junos/routing-policy/topics/topic-map/Improve-as-path-lookup.html

If we do this 100% complete RTR and AS-set filter per neighbor, then
we actually have better routing security than we have in the most
canonical way, because we are enforcing origin:prefix relation, which
we are not enforcing when we dump larger and low quality prefix-sets
to routers. This makes us much less vulnerable to the low quality
as-set both in operational manner by not inflating config sizes and
cause commits to fail and by improving routing security.


>
> Regards,
> Jakob.
>
> -Original Message-
> From: Saku Ytti 
> Sent: Friday, May 13, 2022 12:36 AM
> To: Jakob Heitz (jheitz) 
> Cc: nanog@nanog.org
> Subject: Re: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)
>
> On Fri, 13 May 2022 at 00:44, Jakob Heitz (jheitz) via NANOG
>  wrote:
>
> > RPKI-dropped-only
> > Saves a copy of only the routes dropped by an RPKI validation-state test in 
> > neighbor-in route-policy.
> >
> > RPKI-tested-only
> > Saves a copy of only the routes tested in an RPKI validation-state test in 
> > neighbor-in route-policy.
>
> What does this mean? If any term refers to validation-state, the route
> gets stored?
>
> Eg.
>
> if validation-state is valid then
>   pass
> else
>   drop
>
>
> a) Would 'RPKI-dropped-only' store everything or nothing?
> b) Would 'RPKI-tested-only' store everything?
>
> --
>   ++ytti



--
  ++ytti


RE: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

2022-05-13 Thread Jakob Heitz (jheitz) via NANOG
'RPKI-tested-only' will store all routes that encounter a 'validation-state'
test in the inbound route policy. In that case, when an RPKI server updates a
VRP to the router, it can re-run the inbound policy from the stored route and
not require a refresh request to be sent.

This option saves memory if you use a coarse filter in the route-policy before
the validation test. For example, you use a peer-locking filter to drop peer
routes from your customers before they hit the validation-state test. Then
a massive route leak won't chew up soft-reconfiguration memory.

If a validation-state test drops a route and that route is not stored by
soft-reconfiguration, then when the RPKI server updates any VRP, the router
needs to send a route-refresh request.

'RPKI-dropped-only' causes the dropped routes to be stored. This will prevent
the unnecessary route-refreshes described above. It does not prevent all
route-refreshes, but uses significantly less memory than 'RPKI-tested-only'
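
A toy Python model of which routes each option keeps, going only by the
descriptions above. It is not IOS-XR code. The route fields and function name
are hypothetical, and it assumes the validation-state test drops only invalid
routes.

```python
def run_policy(route, mode):
    """Return (accepted, stored) for one route under a storage mode.

    mode is 'tested-only' or 'dropped-only'. The policy applies a coarse
    filter first and the validation-state test second.
    """
    if mode not in ("tested-only", "dropped-only"):
        raise ValueError(f"unknown mode: {mode}")
    if not route["passes_coarse_filter"]:
        return False, False  # dropped before the RPKI test: never stored
    dropped_by_rpki = route["validation_state"] == "invalid"
    if mode == "tested-only":
        stored = True  # every route that reached the test is kept
    else:
        stored = dropped_by_rpki  # only routes the test dropped are kept
    return (not dropped_by_rpki), stored


leak = {"passes_coarse_filter": False, "validation_state": "invalid"}
good = {"passes_coarse_filter": True, "validation_state": "valid"}
bad = {"passes_coarse_filter": True, "validation_state": "invalid"}
```

In this model a leaked route caught by the coarse filter costs no storage in
either mode, and 'dropped-only' keeps only the "bad" route, which is where the
memory difference comes from.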

Regards,
Jakob.

-Original Message-
From: Saku Ytti  
Sent: Friday, May 13, 2022 12:36 AM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

On Fri, 13 May 2022 at 00:44, Jakob Heitz (jheitz) via NANOG
 wrote:

> RPKI-dropped-only
> Saves a copy of only the routes dropped by an RPKI validation-state test in 
> neighbor-in route-policy.
>
> RPKI-tested-only
> Saves a copy of only the routes tested in an RPKI validation-state test in 
> neighbor-in route-policy.

What does this mean? If any term refers to validation-state, the route
gets stored?

Eg.

if validation-state is valid then
  pass
else
  drop


a) Would 'RPKI-dropped-only' store everything or nothing?
b) Would 'RPKI-tested-only' store everything?

-- 
  ++ytti


RE: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)

2022-05-12 Thread Jakob Heitz (jheitz) via NANOG
To address the risk of somebody exhausting your memory by dumping a ton of
routes on you, we added two new options to "soft-reconfiguration inbound" in
IOS-XR.

RPKI-dropped-only
Saves a copy of only the routes dropped by an RPKI validation-state test in 
neighbor-in route-policy.

RPKI-tested-only
Saves a copy of only the routes tested in an RPKI validation-state test in 
neighbor-in route-policy.

This was released in 7.3.1 in Feb 2021.

The bug CSCwb17937 was fixed in 7.5.2, just released. Fixed a few other things 
in 7.5.2 also.
Tomoya, apologies that you had a terrible time with it.


Regards,
Jakob.

-Original Message-
Date: Wed, 11 May 2022 14:31:28 -0700
From: Randy Bush 
To: Pirawat WATANAPONGSE via NANOG 
Subject: Re: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)
and upstream(s)
Message-ID: 
Content-Type: text/plain; charset=US-ASCII

> Is setting 'Soft Reconfiguration' enough for me to keep ROV running?

yes, should be.

> If not, is there any other solution?

yes.  jakob says he has implemented
https://datatracker.ietf.org/doc/draft-ietf-sidrops-rov-no-rr/, though i
do not know in what xr image(s)

randy



Re: Need for historical prefix blacklist (`rogue' prefixes)

2021-10-31 Thread Jakob Heitz (jheitz) via NANOG
It may be possible to create a fake certificate for a fake ROA.
However, to do that requires a lot of steps to go right.

First, the RSA private key needs to be derived from the public key.
The quantum computer physics exists to do it.
However, the known technology is massively behind and may never materialize.
OTOH, it is a wide open field and someone may find a way to create enough
qubits and entangle them all and keep them stable long enough to
perform the calculation tomorrow.
People have been trying for several years, so this is extremely unlikely.

Second, relying parties need to be convinced/tricked into downloading
the fake certificates. Since each certificate contains the publication points
of its child certificates, the certs are chained together.
The route to a publication point needs to be hacked to cause relying parties
to access the fake publication point.

A point was made that encrypted data can be captured and stored and then
be decrypted later once the technology becomes available. This possibility
is not useful for creating fake ROA certs.

Therefore quantum resistant certificates will not be needed in advance of
the development of quantum certificate crackers.

Regards,
Jakob.

-Original Message-
Date: Sat, 30 Oct 2021 19:57:25 -0500
From: "J. Hellenthal" 

He answered it completely. "You" worried about interception of RPKI exchange 
over the wire are failing to see that there is nothing there important to 
decrypt because the encryption in the transmission is not there !

And yet you've failed to even follow up to his question... "What's your point 
regarding your message? ROV does not use (nor needs) encryption."

So maybe you could give some context on that so someone can steer you out of 
the wrong direction.

-- 
 J. Hellenthal

The fact that there's a highway to Hell but only a stairway to Heaven says a 
lot about anticipated traffic volume.

> On Oct 30, 2021, at 10:31, A Crisan  wrote:
> 
> Hi Matthew, 
> 
> Quantum computing exists as POCs, IBM being one of those advertising them and 
> announced to extend their project. There are others on the market, Amazon 
> advertised quantum computing as a service back in 2019: 
> https://www.theverge.com/2019/12/2/20992602/amazon-is-now-offering-quantum-computing-as-a-service.
>  The bottle neck of the current technology is scalability: we will not see QC 
> as personal computing level just yet (to go in more detail, current 
> technologies work at cryogenic temperatures, thus they are hyper expensive 
> and not really scalable), but they exist and one could imagine they 
> are/will be used for various tasks.
> 
> On the other hand, you've actually commented every word of my mail, minus the 
> stated question. Thanks. 
> 
> Best Regards, 
> Dora Crisan 
> 
> 
> 
>  
> 
>> On Fri, Oct 29, 2021 at 8:10 PM Matthew Walster  wrote:
>> 
>> 
>>> On Fri, 29 Oct 2021, 15:55 A Crisan,  wrote:
>>> Hi Matthew,
>>> I was reading the above exchange, and I do have a question linked to your 
>>> last affirmation. To give you some context, the last 2021 ENISA report seem 
>>> to suggest that internet traffic is "casually registered" by X actors to 
>>> apply post Retrospective decryption (excerpt below). This would be at odds 
>>> with your (deescalating) affirmation that hijacks are non-malicious and 
>>> they are de-peered quickly, unless you pinpoint complete flux arrest only. 
>>> Are there any reportings/indicators... that look into internet flux 
>>> constant monitoring capabilities/capacities? Thanks.
>> 
>> 
>> RPKI uses authentication not confidentiality. There is no encryption taking 
>> place, other than the signatures on the certificates etc.
>> 
>>> Excerpt from the introduction: "What makes matters worse is that any cipher 
>>> text intercepted by an attacker today can be decrypted by the attacker as 
>>> soon as he has access to a large quantum computer (Retrospective 
>>> decryption).
>> 
>> 
>> Which do not exist (yet).
>> 
>>> Analysis of Advanced Persistent Threats (APT) and Nation State capabilities,
>> 
>> 
>> Buzzwords.
>> 
>>> along with whistle blowers' revelations
>> 
>>>  have shown that threat actors can and are casually recording all Internet 
>>> traffic in their data centers
>> 
>> 
>> No they're not. It's just not possible or indeed necessary to duplicate 
>> everything at large scale. Perhaps with a large amount of filtering, certain 
>> flows would be captured, but in the days of pervasive TLS, this seems less 
>> and less worthwhile.
>> 
>>>  and that they select encrypted traffic as interesting and worth 
>>> storing.This means that any data encrypted using any of the standard 
>>> public-key systems today will need to be considered compromised once a 
>>> quantum computer exists and there is no way to protect it retroactively, 
>>> because a copy of the ciphertexts in the hands of the attacker. This means 
>>> that data that needs to remain confidential after the arrival of quantum 
>>> computers need 

RE: "Tactical" /24 announcements

2021-08-17 Thread Jakob Heitz (jheitz) via NANOG
Oh, and your other issue. IOS-XR has two modes in which you can use
RPKI validity. One is where the router automatically uses the
validity. The other mode is where you use the validity in any
way you want in route-policy.

Regards,
Jakob.

-Original Message-
From: Jakob Heitz (jheitz) 
Sent: Tuesday, August 17, 2021 9:59 AM
To: nanog@nanog.org
Subject: RE: "Tactical" /24 announcements

> RPKI validity cover is incomplete.
One way: add your own RTR records. They don't all have to come from
the RPKI.
Another way: Add route-policy to validate the origin-as.
That requires a prefix-set. However, these prefix-sets are much smaller
and the sum of them is smaller than the sum of prefix-sets you would
use on your neighbor sessions.

Regards,
Jakob.

-Original Message-
Date: Tue, 17 Aug 2021 09:22:01 +0300
From: Saku Ytti 

I share your confusion Randy. It seems like perhaps Jakob answered a
slightly different question and his answer is roughly.

a) Use this as-set feature to ensure valid set of ASNs from given peer
b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
and invalids)
c) Don't punch in prefix-lists anywhere

Which in theory works, but in practice it does not, as RPKI validity
cover is incomplete.

Somewhat related, when JNPR implemented RTR the architecture was
planned so that the RTR implementation itself isn't tightly coupled to
RPKI validity. It was planned day1 that customers could have multiple
RTR setups feeding prefixes and the NOS side could use these for other
purposes too. So technically JNPR is mostly missing CLI work to allow
you to feed prefix-lists dynamically over RTR, instead of punching
them in vendor-specific way in config.

I really hope JNPR does that work, I really like the appeal of doing
things off-box and using the same protocol to talk to on-box. Also,
give me gRPC/protobuf route policy API, so I can write my route-policy
in a real programming language once for all my NOS.


On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>
> hi jakob,
>
> i am confused between
>
> > There is no expansion to prefix-set.
>
> and your earlier
>
> >> We have introduced the scalable as-set into the XR route policy language.
> >> as-path-set does not scale well with 1000's of ASNs.
> >> Now, you don't need to expand AS-SET into prefix-set, just enter it 
> >> directly.
>
> expanding AS-SET into prefix filters is exactly what we do.
>
> ```
> % peval -s RIPE AS-RG-SEA
> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
> ```
>
> i do not see how to get around this.  clue bat please
>
> randy



-- 
  ++ytti


RE: "Tactical" /24 announcements

2021-08-17 Thread Jakob Heitz (jheitz) via NANOG
> RPKI validity cover is incomplete.
One way: add your own RTR records. They don't all have to come from
the RPKI.
Another way: Add route-policy to validate the origin-as.
That requires a prefix-set. However, these prefix-sets are much smaller
and the sum of them is smaller than the sum of prefix-sets you would
use on your neighbor sessions.

Regards,
Jakob.

-Original Message-
Date: Tue, 17 Aug 2021 09:22:01 +0300
From: Saku Ytti 

I share your confusion Randy. It seems like perhaps Jakob answered a
slightly different question and his answer is roughly.

a) Use this as-set feature to ensure valid set of ASNs from given peer
b) Validate prefix using RPKI (I'm assuming with rejecting unknowns
and invalids)
c) Don't punch in prefix-lists anywhere

Which in theory works, but in practice it does not, as RPKI validity
cover is incomplete.

Somewhat related, when JNPR implemented RTR the architecture was
planned so that the RTR implementation itself isn't tightly coupled to
RPKI validity. It was planned day1 that customers could have multiple
RTR setups feeding prefixes and the NOS side could use these for other
purposes too. So technically JNPR is mostly missing CLI work to allow
you to feed prefix-lists dynamically over RTR, instead of punching
them in vendor-specific way in config.

I really hope JNPR does that work, I really like the appeal of doing
things off-box and using the same protocol to talk to on-box. Also,
give me gRPC/protobuf route policy API, so I can write my route-policy
in a real programming language once for all my NOS.


On Mon, 16 Aug 2021 at 20:32, Randy Bush  wrote:
>
> hi jakob,
>
> i am confused between
>
> > There is no expansion to prefix-set.
>
> and your earlier
>
> >> We have introduced the scalable as-set into the XR route policy language.
> >> as-path-set does not scale well with 1000's of ASNs.
> >> Now, you don't need to expand AS-SET into prefix-set, just enter it 
> >> directly.
>
> expanding AS-SET into prefix filters is exactly what we do.
>
> ```
> % peval -s RIPE AS-RG-SEA
> ({198.180.153.0/24, 198.180.151.0/24, 147.28.8.0/24, 147.28.9.0/24, 
> 147.28.10.0/24, 147.28.11.0/24, 147.28.12.0/24, 147.28.13.0/24, 
> 147.28.14.0/24, 147.28.15.0/24, 147.28.4.0/24, 147.28.5.0/24, 147.28.6.0/24, 
> 147.28.7.0/24, 147.28.2.0/24, 147.28.3.0/24, 147.28.0.0/23, 45.132.188.0/24, 
> 45.132.189.0/24, 45.132.190.0/24, 45.132.191.0/24})
> ```
>
> i do not see how to get around this.  clue bat please
>
> randy



-- 
  ++ytti


RE: "Tactical" /24 announcements

2021-08-16 Thread Jakob Heitz (jheitz) via NANOG
Saku,

The feature is in 7.2.1. The documentation has not made it to the
command reference.

There is no expansion to prefix-set. The command checks the origin-AS
in the route. You should confirm the origin-AS with the prefix
using RPKI and/or another route-policy statement.
This way the final route-policy configuration will be much smaller.

I'm happy to answer more questions or requests for improvement
on or off list.

Regards,
Jakob.

-Original Message-
From: Saku Ytti  
Sent: Saturday, August 14, 2021 11:11 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: "Tactical" /24 announcements

Hey Jakob,

Is there documentation for this somewhere? Are you saying that the
IOS-XR host will connect to some (configured?) server to expand the
as-set, and at what time? Commit time? Once every N?

On Sun, 15 Aug 2021 at 04:50, Jakob Heitz (jheitz) via NANOG
 wrote:
>
> Ytti,
>
> We have introduced the scalable as-set into the XR route policy language.
> as-path-set does not scale well with 1000's of ASNs.
> Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
> Example:
> as-set test
>   2914,
>   3356
> end-set
> !
> route-policy sample
>   if as-path originates-from test then
>     pass
>   endif
> end-policy
>
> If this does not meet your needs and you need improvements, let me know.
>
> Kind Regards,
> Jakob.
>
> -
> Date: Mon, 9 Aug 2021 19:10:23 +0300
> From: Saku Ytti 
>
> We just recently learned of a IOS-XR prefix-set limit of 31 when a
> particular customer AS-SET expanded to a higher number of prefixes.
>
> --
>   ++ytti
>


-- 
  ++ytti


RE:"Tactical" /24 announcements

2021-08-14 Thread Jakob Heitz (jheitz) via NANOG
Ytti,

We have introduced the scalable as-set into the XR route policy language.
as-path-set does not scale well with 1000's of ASNs.
Now, you don't need to expand AS-SET into prefix-set, just enter it directly.
Example:
as-set test
  2914,
  3356
end-set
!
route-policy sample
  if as-path originates-from test then
    pass
  endif
end-policy

If this does not meet your needs and you need improvements, let me know.

Kind Regards,
Jakob.

-
Date: Mon, 9 Aug 2021 19:10:23 +0300
From: Saku Ytti 

We just recently learned of a IOS-XR prefix-set limit of 31 when a
particular customer AS-SET expanded to a higher number of prefixes.

-- 
  ++ytti



Re: Can somebody explain these ransomwear attacks?

2021-06-26 Thread Jakob Heitz (jheitz) via NANOG
Finding vulnerabilities and how to exploit them to run malware
in closed source code is nigh on impossible. 
Anyone can read open source code.

What is possible is to analyze patches to figure out what was fixed
and then to attack those that didn't apply the patches.

Even easier is old releases. Patches often have more than one fix,
but a patch for an old release is almost guaranteed to be a fix
for a single vulnerability. That makes it easier to analyze.

Regards,
Jakob.



Re: A survey on BGP MRAI timer values in practice

2021-06-09 Thread Jakob Heitz (jheitz) via NANOG
In Cisco, MRAI is "advertisement-interval".
MRAI helps to reduce route update multiplication in highly redundant
networks. OTOH, it can increase the time it takes to re-advertise
a complete internet table in some router implementations.
Update multiplication due to redundant network connections causes
some receivers of the multiple updates to become slow peers.

Here's an experiment: Do something to cause a BGP route refresh, like
the equivalent of "clear bgp soft out". It will not change any routes.
It just resends everything that was already sent. See how long it takes
with MRAI=0. Then set MRAI to about half of that value and do the
refresh again. If it takes substantially longer to complete the refresh,
stick with MRAI=0.
If there is no significant difference, use MRAI of 1 or 2 seconds.
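
The decision rule from the experiment, written out as a small Python sketch.
The 10% cutoff for "substantially longer" is my assumption, not a measured
value, and the function name is made up.

```python
def choose_mrai(refresh_secs_mrai0: float, refresh_secs_half: float,
                tolerance: float = 0.10) -> float:
    """Pick an MRAI following the experiment's decision rule.

    refresh_secs_mrai0: time to finish the refresh with MRAI=0.
    refresh_secs_half:  time to finish it with MRAI set to half that value.
    tolerance: assumed cutoff for "substantially longer" (10% here).
    """
    if refresh_secs_half > refresh_secs_mrai0 * (1 + tolerance):
        return 0.0  # MRAI slows the full-table resend: keep it off
    return 1.0  # no significant difference: 1 or 2 seconds is fine
```

For example, if the refresh takes 60 seconds with MRAI=0 and 90 seconds with
MRAI=30, the rule says stay at 0. If it takes 62 seconds, a 1 second MRAI is
a reasonable choice.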

Regards,
Jakob.

-Original Message-
Date: Wed, 9 Jun 2021 08:53:19 +0300
From: Saku Ytti 

On Wed, 9 Jun 2021 at 01:18, Adam Thompson  wrote:

> If your work results in actionable recommendations such as "don't use BGP
> out-delay timers to mitigate XYZ in circumstance LMNO, do ABC instead",
> that's fantastic.  Please keep us advised, and do post aggregated survey
> results here once you close the survey.
>

What is actionable? What is the goal? The question as OP presented contains
some assumptions

a) better convergence is needed
b) MRAI is important part of the solution space

Neither are provable. We already know how to make DFZ convergence really
fast (or at least orders of magnitude faster than it is), that information
exists, but that isn't deployed because customers are not asking for it, so
providers are not aware that there is room for improvements.

Things don't optimise to be as good as they can be, things optimise to be
as bad as the market allows them to be. And the market accepts the DFZ
convergence.

If you do decide to optimise for DFZ convergence, without commercial
pressure, you will risk lower availability, because you'll be using
configuration less tested by other customers and everyone knows how
terrible quality every NOS is. Everyone finds novel bugs, in the same damn
protocols we've ran +20 years. It's like running Windows and Linux and
regularly finding out listing files in a directory breaks your service,
year after year after year.

For those who are interested in better convergence
   - change your interface down reporting to 0 (there may be delay before
interface down is reported to system, so that optical protection works
without causing outage)
   - use 'add-path' or at least 'best-external' in iBGP, so that you always
have backup eBGP route immediately available once best is invalidated
(normally you have lot of delay to find next best, once you lose your best
eBGP)
   - tie your route validity to IGP, so you can invalidate your BGP the
moment IGP disappears
   - ensure IGP converges fast (another topic)
   - set MRAI to 0
   - use PIC edge
   - ensure your BGP NLRI can be as large as MTU allows
   - ensure your convergence isn't bottle necked by slow peer in group
   - ensure you are not dropping received TCP packets on punt path
   - ensure your fast external fallover works (eBGP down, on int down) this
is quite easy to break
   - then ensure everyone else in the DFZ does the same thing
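
To make the MRAI item concrete, here is a minimal Python sketch of a
simplified model: one peer, one prefix, and the rule that two advertisements
must be at least `mrai` seconds apart, with changes that arrive in between
coalesced into the next advertisement. The function name and the model are
illustrative assumptions, not any vendor's implementation.

```python
def advertisement_times(change_times, mrai):
    """Times at which updates for one prefix are sent to one peer.

    Simplified model (an assumption for illustration): after an update is
    sent, the next one may not be sent until `mrai` seconds later. Changes
    that occur while an update is already waiting are coalesced into it.
    """
    sent = []
    for t in sorted(change_times):
        if sent and sent[-1] >= t:
            # An update is already scheduled at or after this change;
            # it will carry the latest state, so nothing new is scheduled.
            continue
        earliest = t if not sent else max(t, sent[-1] + mrai)
        sent.append(earliest)
    return sent


if __name__ == "__main__":
    changes = [0, 1, 2, 40]          # seconds at which the best path changed
    print(advertisement_times(changes, mrai=30))
    print(advertisement_times(changes, mrai=0))
```

With a 30 second interval the change at t=1 waits until t=30 and the change
at t=40 waits until t=60; with MRAI 0 every change goes out immediately,
which is faster but also means more updates.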



But from a business POV, don't do any of this, you will have more bugs and
lower availability and your customers will be less happy.



> I *am* specifically interested in the answer to "Have you ever had to
> adjust BGP out-delay with any of your peers, and why?"  It would be great
> if we could derive that answer from the survey results, but anecdotal
> replies here would also be helpful.  All you larger(-than-me) network
> operators out there: when would I need to use out-delay?  Why?  What does
> it accomplish?
>
> Good luck in reformulating your survey to get better engagement,
> -Adam
>
> *Adam Thompson*
> Consultant, Infrastructure Services
> 100 - 135 Innovation Drive
> Winnipeg, MB, R3T 6A8
> (204) 977-6824 or 1-800-430-6404 (MB only)
> athomp...@merlin.mb.ca
> www.merlin.mb.ca
>
> --
> *From:* NANOG  on behalf
> of Saku Ytti 
> *Sent:* June 8, 2021 01:06
> *To:* shahr...@cs.umass.edu 
> *Cc:* nanog list ; Arun Venkataramani 
> *Subject:* Re: A survey on BGP MRAI timer values in practice
>
> On Mon, 7 Jun 2021 at 19:32,  wrote:
>
> > We often read that the Internet (i.e. BGP) has a long convergence delay.
> > But why is it so slow? And can we (researchers) do anything about it?
>
> Create business incentives to improve it. This is a non-technical
> problem, we've long had technical tools to make it fast, there just
> isn't incentive to make it fast. Customers are not asking operators
> for better convergence speeds.
>
> > Please help us out to find out by answering our short anonymous survey
> > (<10 minutes).
>
> Can you tell me what have you done so far? What are the default MRAI
> values for each AFI/SAFI for IOS, IOS-XR, 

RE: BGP and The zero window edge

2021-04-21 Thread Jakob Heitz (jheitz) via NANOG
I'd like to get some data on what actually happened
in the real cases and analyze it.

If it's a Cisco router at fault, then we have a bug to fix.
Even if it's not a Cisco, there may be ways we can help
to avoid the situation.
However, before we start on solutions, I'd like to get
a good understanding of what actually happened.

TCP zero window is possible, but many other things could
cause it too.

Anyone?

Regards,
Jakob.

-Original Message-
From: Job Snijders  
Sent: Wednesday, April 21, 2021 2:11 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: BGP and The zero window edge

Dear Jakob, group,

On Wed, Apr 21, 2021 at 08:59:06PM +, Jakob Heitz (jheitz) via NANOG wrote:
> Ben's blog details an experiment in which he advertises routes and then
> withdraws them, but some of them remain stuck for days.
> 
> I'd like to get to the bottom of this problem.

I think there are *two* problems:

1) some BGP implementations (or multi-node BGP configurations) sometimes
   end up getting stuck in one way or another.

2) other BGP nodes are not able to disconnect/reconnect to systems
   suffering from instantiations of problem #1.

While on the one hand it is important to follow-up on each and every
instantiation of problem #1, I personally think it also is worthwhile
exploring whether the BGP FSM itself can be redefined in a way that
encourages BGP protocol implementations to be more robust and rely less
on the remote peer behaving correctly.

Once Problem #2 is addressed, finding and isolating instances of Problem
#1 will become much easier.

> Has anyone else seen this before or can provide data to analyze?
> On or off list.

From the BGP Default-Free Zone perspective it is hard to differentiate
between an entire (multi-vendor) Autonomous System being stuck, or just
one router.

To test individual router implementations this tool is useful
https://github.com/benjojo/bgp-zerowindow-test - but please keep in mind
that "TCP Recv Wind == 0" trick is just one way to easily get a BGP peer
to manifest the problematic behavior.

From a BGP protocol perspective BGP nodes shouldn't inspect the TCP
receive window, but rather focus on whether all locally available
signals indicate that the remote peer is still progressing data.
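
One way to picture "locally available signals" is a watchdog that only looks
at the local send queue: if data has been waiting to go out and the peer has
accepted none of it for too long, the session is declared stuck. The sketch
below is a hypothetical illustration in Python; the class name, the
parameters, and the 90 second limit are assumptions, not taken from any
implementation or draft.

```python
class SendProgressWatchdog:
    """Flags a peer that has stopped accepting data we have queued for it."""

    def __init__(self, stall_limit, now=0.0):
        self.stall_limit = stall_limit   # seconds of no progress tolerated
        self.last_progress = now         # last time the peer was keeping up

    def observe(self, now, queued_bytes, bytes_accepted):
        """Record one observation; return True if the session looks stuck.

        queued_bytes:   bytes currently waiting in the local send queue
        bytes_accepted: bytes the peer accepted since the last observation
        """
        if queued_bytes == 0 or bytes_accepted > 0:
            # Nothing is waiting, or the peer is still draining the queue.
            self.last_progress = now
            return False
        return now - self.last_progress >= self.stall_limit


if __name__ == "__main__":
    w = SendProgressWatchdog(stall_limit=90)
    for now, queued, accepted in [(10, 0, 0), (50, 4096, 0), (100, 4096, 0)]:
        print(now, w.observe(now, queued, accepted))
```

Note that this never reads the TCP receive window; a zero window is just one
of several conditions that would make the queue stop draining.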

Kind regards,

Job


RE: BGP and The zero window edge

2021-04-21 Thread Jakob Heitz (jheitz) via NANOG
Ben's blog details an experiment in which he advertises routes and then
withdraws them, but some of them remain stuck for days.

I'd like to get to the bottom of this problem.

Has anyone else seen this before or can provide data to analyze?
On or off list.

Regards,
Jakob.

-Original Message-
Date: Wed, 21 Apr 2021 07:31:10 -0400
From: "Jean St-Laurent" 

Nice article explaining a specific BGP corner case not removing routes when
TCP window reaches 0.

https://blog.benjojo.co.uk/post/bgp-stuck-routes-tcp-zero-window

The proposed solution is a new RFC for BGP with the suggestion to introduce
a new timer.

Fascinating!

Jean St-Laurent /CISSP
ddosTest me security inc
site: https://ddostest.me 


RE: NANOG Digest, Vol 157, Issue 3

2021-02-03 Thread Jakob Heitz (jheitz) via NANOG
I couldn't put down Bill Norton's book.
https://drpeering.net/core/bookOutline.html
When a cheapskate like me pays the $10, it means something.

Regards,
Jakob.

-Original Message-
Date: Tue, 2 Feb 2021 11:35:34 +0100
From: Casey Callendrello 
To: nanog@nanog.org
Subject: BGP / routing paper recommendations?
Message-ID: <5f47b46e-d43b-a60e-42ef-84149a5a6...@caseyc.net>
Content-Type: text/plain; charset=utf-8; format=flowed

Hi all,

I'm part of a paper reading club, and the group's interest has turned to 
BGP and Internet routing in general. As the only person in the group who 
even knows what an AS is, I've been tasked with finding interesting 
papers on the subject. Any papers or presentations that you found 
valuable or interesting?

My list, so far:

- ARTEMIS: Neutralizing BGP Hijacking Within a Minute 
(https://www.inspire.edu.gr/wp-content/pdfs/artemis_TON2018.pdf)
- Securing BGP - A Literature Survey 
(https://ieeexplore.ieee.org/document/5473881)
- Stable Internet routing without global coordination 
(https://ieeexplore.ieee.org/document/974523)
- A Survey on Approaches to Reduce BGP Interdomain Routing Convergence 
Delay on the Internet (https://ieeexplore.ieee.org/document/7964680)

I would particularly appreciate papers that focus on the 
distributed-systems aspect of routing, such as convergence times, 
stability, and security. Happy to take responses off-list, will 
summarize in due time.

TIA,
-- Casey Callendrello


RE: Summary: advertise-peer-as

2021-01-28 Thread Jakob Heitz (jheitz) via NANOG
Jared,

Agreed it's "interesting".
Please configure "as-path-loopcheck out disable" under bgp address family to 
make it less interesting.
https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k-r7-1/routing/command/reference/b-routing-cr-asr9000-71x/b-routing-cr-asr9000-71x_chapter_01.html#wp3145726977

Regards,
Jakob.

-Original Message-
From: Jared Mauch 

I was also told there's some interesting behavior in IOS-XR, if a peer is in 
the same peer-group and based on the order of the peers coming up, the behavior 
may be different where routes may be suppressed.

- jared



Re: A study on community-triggered updates in BGP

2020-10-21 Thread Jakob Heitz (jheitz) via NANOG
Thomas,

I confirmed your case and took a look at the code.
The outbound duplicate suppression function tries to detect
duplicates without actually storing or recreating the
previously sent update, so it misses some cases.

Your use case is a good one. We will check to see if we can
detect it without compromising significantly on resource usage.
Thank you for raising the issue.
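
To illustrate the trade-off mentioned above, here is a sketch of the
opposite approach: remember a digest of what was last sent per prefix and
compare against it after egress policy has run. This is explicitly not how
IOS-XR does it (the point above is that it avoids storing this state); the
names `egress_filter` and `DuplicateSuppressor` and the attribute layout are
assumptions for illustration.

```python
import hashlib


def egress_filter(attrs):
    """Hypothetical egress policy: delete all communities, pass the rest."""
    out = dict(attrs)
    out.pop("communities", None)
    return out


def digest(attrs):
    """Stable digest of a set of path attributes."""
    canonical = repr(sorted(attrs.items()))
    return hashlib.sha256(canonical.encode()).digest()


class DuplicateSuppressor:
    """Per-peer state: one digest per prefix (this is the memory cost)."""

    def __init__(self):
        self.last_sent = {}

    def should_send(self, prefix, attrs):
        d = digest(egress_filter(attrs))
        if self.last_sent.get(prefix) == d:
            return False          # identical to what the peer already has
        self.last_sent[prefix] = d
        return True


if __name__ == "__main__":
    s = DuplicateSuppressor()
    base = {"as_path": (65001,), "next_hop": "10.12.0.2"}
    print(s.should_send("203.0.113.0/24", {**base, "communities": ("65001:1",)}))
    print(s.should_send("203.0.113.0/24", {**base, "communities": ("65001:2",)}))
```

The second call differs only in a community that the egress policy strips,
so the post-policy update is identical and is suppressed.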

Regards,
Jakob.

-Original Message-
Date: Tue, 20 Oct 2020 04:48:37 -0700
From: Thomas Krenc 

Hi Jakob.

The simple configuration below allows communities to be forwarded
(send-community-ebgp), but are cleaned at egress (using route-policy and
community-set).

In the experiment, the router receives announcements with altering
community attributes only, from the internal peer. After the filter is
applied, the router sends duplicates to the external peer.

Also, In a slightly different setup, the router sends duplicates due to
changes in the next-hop only.

best regards
Thomas

---

RP/0/0/CPU0:ios(config)#show running-config
Tue Oct 20 02:56:24.230 UTC
Building configuration...
!! IOS XR Configuration 6.0.1
!! Last configuration change at Tue Oct 20 02:56:02 2020 by cisco
!
interface MgmtEth0/0/CPU0/0
 shutdown
!
interface GigabitEthernet0/0/0/0
 ipv4 address 10.12.0.2 255.255.255.252
!
interface GigabitEthernet0/0/0/1
 ipv4 address 10.20.0.1 255.255.255.252
!
community-set all
  *:*
end-set
!
route-policy nofilter
  pass
end-policy
!
route-policy egressfilter
  delete community in all
  pass
end-policy
!
router bgp 65002
 bgp router-id 10.12.0.2
 address-family ipv4 unicast
!
 neighbor 10.12.0.1
  remote-as 65001
  address-family ipv4 unicast
   send-community-ebgp
   route-policy egressfilter out
!
 neighbor 10.20.0.2
  remote-as 65002
  address-family ipv4 unicast
!
end

On 10/17/20 3:59 PM, Jakob Heitz (jheitz) via NANOG wrote:
> IOS-XR has duplicate update suppression logic for EBGP sessions,
> not for IBGP sessions.
>
> If you are using EBGP and seeing a fault in the duplicate update
> suppression logic in IOS-XR, please let me know configs and details
> of the experiment.
>
> Regards,
> Jakob.
>
> -Original Message-
> Date: Thu, 15 Oct 2020 18:35:58 -0700
> From: Thomas Krenc 
>
> Dear NANOG,
>
> As a team of researchers from NPS and TU Berlin, we are investigating
> the impact of BGP community attributes on the update behavior between ASes.
>
> We find that when a route is associated with multiple distinct community
> attributes it does not only lead to multiple announcements at the tagging
> AS, but also at neighboring ASes, if communities are not filtered
> properly. This behavior is wide-spread.
>
> In order to better understand our observations, we have performed a
> series of laboratory experiments using Cisco IOS, Junos OS, as well as
> the BIRD daemon.
>
> We find that - by default - all tested routers generate announcements
> with changing community attributes, even when other attributes do not
> change. In addition, when communities are filtered at egress, Cisco and
> BIRD send duplicate announcements (Juniper does not).
>
> Since our findings are limited to observations in public data as well as
> few router implementations, we would like to share our research and
> kindly ask you to have a look at:
>
>     https://www.cmand.org/communityexploration/
>
> There, we provide some resources documenting our research, as well as
> open questions. We greatly appreciate any feedback and insights you can
> offer. Also, please don't hesitate to contact us directly:
>
>     communityexploration AT cmand DOT org
>
> best regards
>
> Thomas Krenc
> Postdoctoral Researcher
> Naval Postgraduate School


Re: A study on community-triggered updates in BGP

2020-10-18 Thread Jakob Heitz (jheitz) via NANOG
This feature suppresses outgoing duplicates. Another feature ignores incoming 
duplicates from any BGP session.

Regards,
Jakob.


> On Oct 18, 2020, at 1:46 AM, Clemens Mosig  wrote:
> 
> On 18.10.20 00:59, Jakob Heitz (jheitz) via NANOG wrote:
>> IOS-XR has duplicate update suppression logic for EBGP sessions,
>> not for IBGP sessions.
> 
> Does this feature hinder incoming duplicates from triggering best path 
> selection or does it stop the export of duplicates?
> 
> Cheers,
> Clemens


A study on community-triggered updates in BGP

2020-10-17 Thread Jakob Heitz (jheitz) via NANOG
IOS-XR has duplicate update suppression logic for EBGP sessions,
not for IBGP sessions.

If you are using EBGP and seeing a fault in the duplicate update
suppression logic in IOS-XR, please let me know configs and details
of the experiment.

Regards,
Jakob.

-Original Message-
Date: Thu, 15 Oct 2020 18:35:58 -0700
From: Thomas Krenc 

Dear NANOG,

As a team of researchers from NPS and TU Berlin, we are investigating
the impact of BGP community attributes on the update behavior between ASes.

We find that when a route is associated with multiple distinct community
attributes it does not only lead to multiple announcements at the tagging
AS, but also at neighboring ASes, if communities are not filtered
properly. This behavior is wide-spread.

In order to better understand our observations, we have performed a
series of laboratory experiments using Cisco IOS, Junos OS, as well as
the BIRD daemon.

We find that - by default - all tested routers generate announcements
with changing community attributes, even when other attributes do not
change. In addition, when communities are filtered at egress, Cisco and
BIRD send duplicate announcements (Juniper does not).

Since our findings are limited to observations in public data as well as
few router implementations, we would like to share our research and
kindly ask you to have a look at:

    https://www.cmand.org/communityexploration/

There, we provide some resources documenting our research, as well as
open questions. We greatly appreciate any feedback and insights you can
offer. Also, please don't hesitate to contact us directly:

    communityexploration AT cmand DOT org

best regards

Thomas Krenc
Postdoctoral Researcher
Naval Postgraduate School


RE: Juniper configuration recommendations/BCP

2020-10-13 Thread Jakob Heitz (jheitz) via NANOG
IOS-XR accepts extended communities and large communities by default.
Sending them has to be enabled explicitly; receiving them does not.

Regards,
Jakob.

-Original Message-
Date: Mon, 12 Oct 2020 15:06:05 +0100
From: 

Here's a fun one.
By default Junos accepts extended communities on any BGP session (not just
on MP-BGP sessions like it's the default case on cisco -unless explicitly
enabled).
Since most operators are not aware of this default Junos behaviour, one can
be importing routes to interesting places if one were so inclined.  

-so yeah bleach unwanted communities on ingress (bleach those that would
interfere with the ones used by the AS internally -so called
"untaggable"/"untouchable" ).  

adam

> -Original Message-
> From: NANOG  bounces+adamv0025=netconsultings@nanog.org> On Behalf Of
> Chriztoffer Hansen
> Sent: Thursday, October 8, 2020 11:05 AM
> To: nanog@nanog.org
> Subject: Juniper configuration recommendations/BCP
> Importance: Low
> 
> 
> On 08/10/2020 11:37, Forrest Christian (List Account) wrote:
> > Is there anything I should worry about which is Juniper-specific?
> 
> JUNOS default ARP timeout: 20 min.
> 
> If you connect to IXP's. Recommended ARP timeout: 4 hours.



RE: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-04 Thread Jakob Heitz (jheitz) via NANOG
I was made aware of another bug in IOS-XR: CSCuv94859. Thanks Job and Ryan.
It caused some routes with NO_EXPORT to sometimes be advertised to EBGP
after an NSR switchover during a software upgrade.
It was fixed in 2015.

Regards,
Jakob.

-Original Message-
From: Jakob Heitz (jheitz) 
Sent: Tuesday, August 4, 2020 10:24 AM
To: nanog@nanog.org
Subject: Re: Issue with Noction IRP default setting (Was: BGP route hijack by 
AS10990)

CSCdj01351. Fixed in 1997.

Regards,
Jakob.

-Original Message-
Date: Sat, 1 Aug 2020 13:29:59 -0700
From: Ryan Hamel 

...
Also, wasn't it you that said Cisco routers had a bug in ignoring NO_EXPORT?
...


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-04 Thread Jakob Heitz (jheitz) via NANOG
CSCdj01351. Fixed in 1997.

Regards,
Jakob.

-Original Message-
Date: Sat, 1 Aug 2020 13:29:59 -0700
From: Ryan Hamel 

...
Also, wasn't it you that said Cisco routers had a bug in ignoring NO_EXPORT?
...


RE: Don Smith, RIP.

2020-07-24 Thread Jakob Heitz (jheitz) via NANOG
Don was a great guy. I learnt a few things about Flowspec from him.
Sorry to see him go.

Regards,
Jakob.

-Original Message-

Date: Thu, 23 Jul 2020 23:22:45 +
From: "Dobbins, Roland" 

It is with a heavy heart that I must relate the news that Don Smith, formerly 
of CenturyLink and more lately of Netscout Arbor, passed away in his sleep last 
night.

Don was a colleague, friend, and mentor to many; he was a mainstay of the 
operational community, and tirelessly worked to make the Internet safer and 
more resilient for us all.  His intellect, wit, and generosity of spirit were 
well-known to those who were privileged to have the opportunity to work with 
and learn from him.

Don's contributions to the industry were manifold.  While we are all diminished 
by his loss, his legacy abides; and we can honor him by continuing to build 
upon that foundation, for the betterment of the Internet community as a whole.

Once Don's family have established plans for his memorial, they will be posted 
here.

Roland Dobbins 




Re: Partial vs Full tables

2020-06-08 Thread Jakob Heitz (jheitz) via NANOG
These are the first steps to optimization. Hysteresis is another.
They work in ideal cases. However, when coding, we need to consider
all cases. How do you set the timer?
The timer has to anticipate the future. It's like an automatic
transmission. It can't anticipate the future. However, the worst
that can happen if the automatic transmission anticipates
incorrectly is that it hunts.

Regards,
Jakob.

-Original Message-
Date: Mon, 8 Jun 2020 10:14:17 +0200
From: Baldur Norddahl 

On 08.06.2020 07.56, Jakob Heitz (jheitz) via NANOG wrote:
> FIB compression comes with some risks.
> When routes churn, there are certain cases when you have to decompress the 
> FIB.
> Then, the FIB must have the space, or else OOPS.
> If a set of compressed routes has to change to decompress some and compress a
> different set to improve overall compression, there is a lot of FIB
> programming going on. This can cause very long convergence times.
>

The easy solution is to introduce some delay before programming the FIB. 
Or even process RIB updates as a separate thread, such that the FIB 
update thread does not try to program every step the RIB might go 
through. Instead the FIB update thread would take a snapshot of where we 
are now and where do we want to be and only program the diff.

Given the concept is a smaller FIB size, this might actually end up 
being less FIB programming.
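
The snapshot-and-diff idea above can be sketched in a few lines of Python.
This is a minimal illustration under simple assumptions: both tables are
plain dictionaries mapping prefix to next hop, and `fib_diff` is a
hypothetical helper name, not part of any router software.

```python
def fib_diff(programmed, desired):
    """Compute the minimal set of FIB operations to reach `desired`.

    programmed: what is currently in hardware (prefix -> next hop)
    desired:    snapshot of where the RIB is right now (prefix -> next hop)
    """
    to_add = {p: nh for p, nh in desired.items() if p not in programmed}
    to_change = {p: nh for p, nh in desired.items()
                 if p in programmed and programmed[p] != nh}
    to_delete = [p for p in programmed if p not in desired]
    return to_add, to_change, to_delete


if __name__ == "__main__":
    programmed = {"10.0.0.0/24": "A", "10.0.1.0/24": "A", "10.0.2.0/24": "B"}
    # The RIB may have passed through many intermediate states; only the
    # latest snapshot is compared with what is programmed.
    desired = {"10.0.0.0/24": "A", "10.0.1.0/24": "C", "10.0.3.0/24": "B"}
    print(fib_diff(programmed, desired))
```

Any intermediate RIB states between two snapshots never reach the hardware,
which is where the saving in FIB programming would come from. The open
question raised in the reply above (how long to wait between snapshots)
is not addressed by this sketch.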

Regards,

Baldur



Re: Partial vs Full tables

2020-06-07 Thread Jakob Heitz (jheitz) via NANOG
FIB compression comes with some risks.
When routes churn, there are certain cases when you have to decompress the FIB.
Then, the FIB must have the space, or else OOPS.
If a set of compressed routes has to change to decompress some and compress a
different set to improve overall compression, there is a lot of FIB
programming going on. This can cause very long convergence times.
Because a FIB memory cell can not forward and be programmed at the same time,
forwarding takes preference and programming speed suffers.
FIB programming is the slowest part of convergence, the bottleneck.
If routes also have a backup path loaded, then the backup nexthops also
need to be the same in order to compress. During convergence, not all
routes change at the same time and there could be some very uncompressible
transient route sets during convergence.
Some possible sequences of compress/decompress during convergence could
cause a lot of churn in FIB programming.
This presents lots of opportunities for optimization and thus bugs.
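
A toy example of the decompression risk, as a Python sketch. It uses a
deliberately naive scheme (merge two sibling prefixes into their parent when
they share a next hop), which is an assumption for illustration only; real
FIB compression algorithms are more sophisticated.

```python
import ipaddress


def compress(routes):
    """Repeatedly merge sibling prefixes that share a next hop."""
    table = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
    merged = True
    while merged:
        merged = False
        # Work from the longest prefixes upward; iterate over a copy
        # because the table is modified inside the loop.
        for net in sorted(table, key=lambda n: n.prefixlen, reverse=True):
            if net not in table or net.prefixlen == 0:
                continue
            parent = net.supernet()
            siblings = list(parent.subnets())
            same_nh = all(table.get(s) == table[net] for s in siblings)
            if same_nh and parent not in table:
                nh = table[net]
                for s in siblings:
                    del table[s]
                table[parent] = nh
                merged = True
    return {str(n): nh for n, nh in table.items()}


if __name__ == "__main__":
    rib = {"10.0.0.0/25": "A", "10.0.0.128/25": "A",
           "10.0.1.0/24": "A", "10.0.2.0/24": "B"}
    print(len(compress(rib)), compress(rib))
    rib["10.0.0.128/25"] = "B"      # a single next-hop change
    print(len(compress(rib)), compress(rib))
```

In this example four routes compress to two FIB entries, and one next-hop
change expands them back to four. The FIB needs room for the uncompressed
case, and the change touches more entries than the one route that moved.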

Regards,
Jakob.



RE: attribution

2020-04-17 Thread Jakob Heitz (jheitz) via NANOG
From version 6.3.1, IOS XR supports "if community length" in route-policy.

Regards,
Jakob.

-Original Message-
Date: Fri, 17 Apr 2020 12:29:33 +0100
From: 

On the point of as-path length limit, Yes I know of at least one tier-1 that 
does it and since I left some 8 years back I do it everywhere I go.
In addition to the above (best common practice, I'd say)
-on junos you can do community length limiting
-and on cisco you can do attribute filtering  -hence my question to this forum 
some time back about whether folks do filter all the "experiments" for the sake 
of running a successful business (paraphrasing...)

adam



RE: Route aggregation w/o AS-Sets

2020-04-15 Thread Jakob Heitz (jheitz) via NANOG
Sorry, I did not intend to imply that you were.
I should have prefaced my post with "to add".

Regards,
Jakob.

From: Matthew Petach 
Sent: Wednesday, April 15, 2020 4:29 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Route aggregation w/o AS-Sets


I apologize if I wasn't clear.

I don't recommend ever using AS_SET.

So, in rule 3, I use the atomic-aggregate knob
to announce the single covering aggregate with
my backbone ASN as the atomic-aggregate origin
AS, and I don't generate or propagate any AS_SET
information along with the aggregate.

That way, no loop is seen by any of the downstream
networks that are announced the aggregate prefix.

I hope that helps clear up what I meant in my third
rule.  :)

Thanks!

Matt


On Wed, Apr 15, 2020 at 11:26 AM Jakob Heitz (jheitz) via NANOG
<nanog@nanog.org> wrote:
Suppose you had a set of customers that all announced to you a set of routes
and all those routes complete an aggregate
and you announce only the aggregate to those customers
and you include an AS_SET with it
then those customers will drop your aggregate, thinking there is an AS-loop
and those customers will not be able to reach each other.

An AS_SET does not prevent routing loops and can prevent correct routing.

But you must include the ATOMIC_AGGREGATE attribute, so that someone else
does not disaggregate your aggregate that does not have the AS_SET.

Regards,
Jakob.

-Original Message-
Date: Tue, 14 Apr 2020 02:32:37 -0700
From: Matthew Petach <mpet...@netflight.com>

I generally would use the atomic-aggregate knob to
generate aggregate routes for blocks I controlled,
when the downstream ASN information was not
necessary to propagate outside my network
(usually cases where I had multiple internal ASNs,
but all connectivity funneled through a single upstream pathway.)

If you have discrete downstream ASNs with potentially
different external pathways, you shouldn't be generating
aggregate routes that cover them; that's just bad routing 101.

Thus, my rules for aggregation always came down to:
1) Is there more than one external/upstream pathway for the ASN and prefix?
If so, don't aggregate.
2) Is there redundant, reliable connectivity between all the external gateway
routers that would be announcing the aggregate?
If not, don't generate a covering aggregate.
3) If there's only a single upstream pathway through you for the ASN and
prefix, and that won't be changing any time soon (e.g., you have a collection
of downstream datacenters with their own ASNs and prefixes, but they all
route through a common backbone), then use the atomic-aggregate option to
suppress the more specific AS_PATH information, and simply announce the
space as a single aggregate coming from your backbone ASN.

That way, there's no confusion with RPKI and AS_SETs; all you're ever
announcing are simple AS_PATHs for a given prefix.

Best of luck!

Matt



RE: Route aggregation w/o AS-Sets

2020-04-15 Thread Jakob Heitz (jheitz) via NANOG
Suppose you had a set of customers that all announced to you a set of routes
and all those routes complete an aggregate
and you announce only the aggregate to those customers
and you include an AS_SET with it
then those customers will drop your aggregate, thinking there is an AS-loop
and those customers will not be able to reach each other.

An AS_SET does not prevent routing loops and can prevent correct routing.

But you must include the ATOMIC_AGGREGATE attribute, so that someone else
does not disaggregate your aggregate that does not have the AS_SET.
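
The scenario above can be walked through with a small Python sketch of
receive-side AS loop detection. The AS numbers (64500 for the provider,
65101 and 65102 for the customers) and the path representation are
assumptions chosen for illustration.

```python
def accepts(local_as, as_path):
    """Return True if `local_as` would accept a route with this AS_PATH.

    as_path is a list of segments, each ("SEQ", [asns]) or ("SET", {asns}).
    A route is rejected as a loop if the local AS appears in any segment,
    including an AS_SET.
    """
    return all(local_as not in members for _kind, members in as_path)


if __name__ == "__main__":
    provider = 64500
    customers = [65101, 65102]

    with_as_set = [("SEQ", [provider]), ("SET", set(customers))]
    without_as_set = [("SEQ", [provider])]

    for c in customers:
        print(c, accepts(c, with_as_set), accepts(c, without_as_set))
```

Each customer finds its own AS number inside the AS_SET and drops the
aggregate, so it loses the only route toward the other customer. The same
aggregate without the AS_SET is accepted by both.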

Regards,
Jakob.

-Original Message-
Date: Tue, 14 Apr 2020 02:32:37 -0700
From: Matthew Petach 

I generally would use the atomic-aggregate knob to
generate aggregate routes for blocks I controlled,
when the downstream ASN information was not
necessary to propagate outside my network
(usually cases where I had multiple internal ASNs,
but all connectivity funneled through a single upstream pathway.)

If you have discrete downstream ASNs with potentially
different external pathways, you shouldn't be generating
aggregate routes that cover them; that's just bad routing 101.

Thus, my rules for aggregation always came down to:
1) Is there more than one external/upstream pathway for the ASN and prefix?
If so, don't aggregate.
2) Is there redundant, reliable connectivity between all the external gateway
routers that would be announcing the aggregate?
If not, don't generate a covering aggregate.
3) If there's only a single upstream pathway through you for the ASN and
prefix, and that won't be changing any time soon (e.g., you have a collection
of downstream datacenters with their own ASNs and prefixes, but they all
route through a common backbone), then use the atomic-aggregate option to
suppress the more specific AS_PATH information, and simply announce the
space as a single aggregate coming from your backbone ASN.

That way, there's no confusion with RPKI and AS_SETs; all you're ever
announcing are simple AS_PATHs for a given prefix.

Best of luck!

Matt



RE: Practical guide to predicting latency effects?

2020-04-08 Thread Jakob Heitz (jheitz) via NANOG
My data point:

I'm working from home. My computer is connected through company VPN, over wifi 
to Comcast.
Comcast speed test says 18 ms.
I use VNC and Webex with voice and video through the computer.
VNC response time and voice delay are not noticeable.

Regards,
Jakob.

-Original Message-
Date: Tue, 7 Apr 2020 22:52:18 +
From: Adam Thompson 

I’m looking for a practical guide – i.e. specifically NOT an academic paper, 
thanks anyway – to predicting the effect of increased (or decreased) latency on 
my user’s applications.

Specifically, I want to estimate how much improvement there will be in 
{bandwidth, application XYZ responsiveness, protocol ABC goodput, whatever} if 
I decrease the RTT between the user and the server by 10msec, or by 20msec, or 
by 40msec.

My googling has come up with lots of research articles discussing theoretical 
frameworks for figuring this out, but nothing concrete in terms of a calculator 
or even a rule-of-thumb.
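
One widely used rule of thumb, offered as a rough sketch rather than a full
answer: a single TCP flow that is limited by its window cannot move more
than one window of data per round trip, so its throughput is at most window
size divided by RTT. The 64 KiB window below is an assumed example value;
real stacks scale the window, and loss or application behaviour may dominate
instead.

```python
def window_limited_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on single-flow TCP throughput when window-limited."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    window = 64 * 1024                      # assumed 64 KiB window
    for rtt_ms in (60, 40, 20):
        bps = window_limited_throughput_bps(window, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms -> at most {bps / 1e6:.1f} Mbit/s per flow")
```

Under this model, halving the RTT doubles the ceiling for a window-limited
flow, while a flow that is already limited by link capacity sees no
throughput change at all.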

Ultimately, this goes into MY calculator – we have the usual north-american 
duopoly on last-mile consumer internet here; I’m connected directly to only one 
of the two.  There’s a cost $X to improve connectivity so I’m peered with both, 
how do I tell if it will be worthwhile?

Anyone got anything at all that might help me?

Thanks in advance,
-Adam

Adam Thompson
Consultant, Infrastructure Services
100 - 135 Innovation Drive
Winnipeg, MB, R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
athomp...@merlin.mb.ca
www.merlin.mb.ca



RE: China’s Slow Transnational Network

2020-03-03 Thread Jakob Heitz (jheitz) via NANOG
I can corroborate that. I visited China in August 2019 and had terrible 
internet performance to sites outside of China. This was both with mobile and 
wifi at the homes of two friends, one in Heilongjiang and the other in Beijing. 
When I visited in February 2015, it was much better. Both times, I was using 
VNC on the company VPN. This does not use much bandwidth, but is quite latency 
sensitive.

-Original Message-
Date: Sun, 1 Mar 2020 21:00:05 -0800
From: Pengxiong Zhu 

Hi all,

We are a group of researchers at University of California, Riverside who
have been working on measuring the transnational network performance (and
have previously asked questions on the mailing list). Our work has now led
to a publication in Sigmetrics 2020 and we are eager to share some
interesting findings.

We find China's transnational networks have extremely poor performance when
accessing foreign sites, where the throughput is often persistently
low (e.g., for the majority of the daytime). Compared to other countries we
measured including both developed and developing, China's transnational
network performance is among the worst (comparable and even worse than some
African countries).

Measuring from more than 400 pairs of mainland China and foreign nodes over
more than 53 days, our results show that when data is transferred from
foreign nodes to China, 79% of measured connections have throughput lower
than 1 Mbps, and sometimes much lower. The slow speed occurs only during
certain times and forms a diurnal pattern that resembles congestion
(irrespective of network protocol and content); please see the following
figure. The diurnal pattern is fairly stable: for 80% to 95% of the
transnational connections, the standard deviation of the daily slowdown
hours is less than 3 hours over the entire duration. However, the speed
rises from 1 Mbps to 4 Mbps in about half an hour.


We are able to confirm that high packet loss rates and delays are incurred
in the foreign-to-China direction only. Moreover, the end-to-end loss rate
could rise up to 40% during the slow period, with ~15% on average.

There are a few things noteworthy regarding the phenomenon. First of all,
all traffic types are treated equally (HTTP(S), VPN, etc.), which means it
is not discriminating against or differentiating any specific kind of
traffic. Second, we found that for 71% of connections, the bottleneck is
located inside China (the second hop after entering China or further), which
means that it is mostly unrelated to the transnational link itself (e.g.,
submarine cable). Yet we never observed any such domestic traffic slowdowns
within China. Assuming this is due to congestion, it is unclear why the
infrastructure within China that handles transnational traffic is not even
capable of handling the capacity of the transnational links, e.g., submarine
cables, which may be the most expensive investment themselves.

Here is the link to our paper:
https://www.cs.ucr.edu/~zhiyunq/pub/sigmetrics20_slowdown.pdf

We appreciate any comments or feedback.
-- 

Best,
Pengxiong Zhu
Department of Computer Science and Engineering
University of California, Riverside


RE: Starting to Drop Invalids for Customers

2020-02-03 Thread Jakob Heitz (jheitz) via NANOG
Lukas,

CSCvc84848

Will keep you in the loop too, Lukas.

Regards,
Jakob.

-Original Message-
From: Lukas Tribus  
Sent: Monday, February 3, 2020 12:43 AM
To: Mark Tinka ; Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Starting to Drop Invalids for Customers

Hello,

On Tue, 14 Jan 2020 at 07:21, Mark Tinka  wrote:
>  On 13/Jan/20 21:53, Jakob Heitz (jheitz) wrote:
> > Mark,
> >
> > Thanks for bringing this up again.
> > I remember this from nearly 3 years ago when Randy brought it up.
> > A bug was filed, but it disappeared in the woodwork.
> > I have now given it the high priority tag that it should have had initially.
> > Sorry about the mess up.
>
> Many thanks, Jakob, for bumping this. Much appreciated, as I was
> dreading running this through my account team :-).
>
> Most grateful if you can keep us (or me, whichever you prefer) posted on
> the progress of this fix. I am willing to test code to verify things.

I'm also very interested to follow the progress here. Is there a BugID
you guys can share?


Thank you,

Lukas


Re: Starting to Drop Invalids for Customers

2020-01-13 Thread Jakob Heitz (jheitz) via NANOG
Mark,

Thanks for bringing this up again.
I remember this from nearly 3 years ago when Randy brought it up.
A bug was filed, but it disappeared in the woodwork.
I have now given it the high priority tag that it should have had initially.
Sorry about the mess up.

In the meantime, you may be able to signal the validation state in iBGP
once it is validated at the network edge.
For an iBGP neighbor, use a configuration like this:
   neighbor 192.0.2.1 announce rpki state

Regards,
Jakob.

Date: Sat, 11 Jan 2020 23:12:27 +0200
From: Mark Tinka 


On 10/Jan/20 16:15, Lukas Tribus wrote:

> Thanks for sharing all this. Regarding those 2 platforms specifically,
> what release are you using here that does not blow up?

On the ASR920, we are on 16(11)01a.

On the ME3600X, we are on 15.6(2)SP6.


>  IIRC you had
> some RPKI related crash bugs at some point in time?

Yes, that was the first time we were deploying RPKI in 2014 and the code
back then crashed the ME3600X.

No such problem this time around.


> - there is no ROA, so prefixes are supposed to be UNKNOWN on all nodes
> - but IOS-XE prefers VALID over UNKNOWN (changing best path selection)
> - iBGP is *always* VALID (even if it's really UNKNOWN), eBGP is
> showing UNKNOWN, so iBGP is preferred over eBGP which breaks a lot of
> assumptions and "hot potato" concepts (possible temporary routing
> loops, other than of course different egress behavior)

So your timing on this is ominous.

In the last day or so, we had an issue with a customer on one of our
ASR1006 edge routers that fell victim to this IOS XE stupidity. An
alternate path toward them learned from a peer was sent back to the edge
router they are connected to, which chose it over the local one because,
well, it was an iBGP route. We didn't notice this issue with this
customer since enabling ROV on this box weeks ago, which means the
alternate route became available in the last 2 - 3 days, e.g., perhaps
they turned up an alternative provider, or changed their routing toward
them for us to see another path.

Since this IOS XE stupidity is not configurable, what we've decided to
do is disable ROV on all ASR1006 boxes for now. This is not a big issue
for us. We've only got 2 customers using them as these boxes only carry
non-Ethernet customers.

While this should be an issue for our ASR920 and ME3600X routers also,
it isn't because we run BGP-SD on those, i.e., even if the RIB will have
all iBGP routes marked as Valid, they won't be installed in FIB, whereas
the eBGP routes learned locally from the customer will.

Having to create a ROA to solve this, while feasible, is inappropriate
for a solution, especially when Juniper do it correctly.

Randy and I complained to Cisco about this years back, and AFAIK, it was
only fixed in IOS XR. That this is still going on in 2020 is silly,
especially when it's clear that they are in violation of the RFC.


> Apparently there is an IOS feature "Announce RPKI Validation State to
> Neighbors" to transmit the *real* RPKI state in iBGP (so as opposed to
> defaulting to VALID for all iBGP neighbors), I'm not sure if that
> fixes this problem or not. It doesn't really address the root cause
> (which is: unwanted and not configurable interference with the best
> path selection algorithm) - but it can at least hide its symptoms.

I've not tried communicating RPKI state between routers via BGP communities.

One of the reasons I like RPKI is because it is a feature that works on
each router independent of another. Each router has a discrete RTR
session to a validator, and can make its own RPKI decisions without any
regard for the rest of the network. And yet all routers in the network
can do this and equally have a converged RPKI state, without ever
speaking to one another.

So the idea of having routers co-ordinate RPKI information through
communities is one I am not so keen on, if I'm honest. Not only do you
need to worry about inter-op issues between vendors, there is potential
for problems when code changes over time. I'd rather not deal with that,
especially since what Cisco are doing with IOS XE is simply a broken
implementation.

That said, if there is anyone out there who has done this and sees it as
a solution to the problem, I'm sure this list would like to hear about it.


> RPKI implementations should not touch best path selection. Dropping
> RPKI invalids is the real use-case here, and if someone wants to
> loc-pref based on RPKI status we should allow it (even if it doesn't
> make a lot of sense), but having the RPKI implementation intervene in
> the best path selection without the possibility to disable it is ...
> frustrating.

Agreed.

At least, if IOS XE had a knob that could "set rpki
[valid|notfound|invalid]" this would somewhat help. But alas, they don't
:-(.

You can only match on existing RPKI state. You cannot manually set RPKI
state in IOS XE routing policy. I mean, how dumb is that?

It's pretty presumptuous of Cisco to automatically apply 

RE: fuzzy subnet aggregation

2019-10-30 Thread Jakob Heitz (jheitz) via NANOG
Another thing to consider is how long it takes to download into forwarding 
hardware.
Forwarding hardware is optimized for forwarding, not programming.
The programming has to wait for time slots when forwarding is not using the 
memory.

When you do smart aggregation, a single changed route could cause a massive
change in your aggregates. Then the resulting download could be both long
and cause glitches. Glitches, because you remove some aggregates while adding
others within a finite time. During this finite time, you may have incorrect
routing.
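
A rough sketch of the churn described above, using Python's standard
ipaddress module. Exact aggregation via collapse_addresses stands in for
"smart" aggregation here (an assumption; a fuzzy scheme would behave
differently in detail), and the prefixes are made-up documentation addresses.

```python
import ipaddress

def aggregate(routes):
    """Collapse prefixes into the smallest exact covering set."""
    return set(ipaddress.collapse_addresses(routes))

def fib_delta(before, after):
    """Return (entries to remove, entries to add) in the forwarding table."""
    return sorted(before - after), sorted(after - before)

# Hypothetical input: eight contiguous /27 routes that aggregate to one /24.
routes = list(ipaddress.ip_network("198.51.100.0/24").subnets(new_prefix=27))
before = aggregate(routes)

# Withdraw a single /27 and re-aggregate what is left.
withdrawn = ipaddress.ip_network("198.51.100.96/27")
after = aggregate([r for r in routes if r != withdrawn])

removed, added = fib_delta(before, after)
print("remove:", [str(n) for n in removed])
print("add:   ", [str(n) for n in added])
```

One withdrawn route removes the single /24 aggregate and adds three new
entries. Between the removal and the last addition, part of the address
space has no entry at all, which is the glitch window mentioned above.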

Regards,
Jakob.



RE: BGP over TLS

2019-10-21 Thread Jakob Heitz (jheitz) via NANOG
The article linked says no mainstream BGP implementation supports TCP-AO.
IOS-XE and IOS-XR support it.

While I do not represent the Cisco view, personally I like the idea of BGP over 
TLS.

Regards,
Jakob.

-Original Message-

Date: Mon, 21 Oct 2019 19:21:03 +1100
From: Julien Goodwin 


On 21/10/19 6:30 pm, Bjørn Mork wrote:
> Christopher Morrow  writes:
> 
>> isn't julien's idea more akin to DOT than DOH?
> 
> Yes, and I really like Julien's proposal.  It even looks pretty
> complete.  There are just a few details missing around how to make the
> MD5 => TLS transition smooth.

At least for those systems that run on Linux (which is almost all of the
majors except Juniper) I suspect if we went to the relevant kernel folk
with a clear plan for handling TCP-MD5 in a way that would make
transitions much easier they'd listen.

The troll response at the top of my post was actually based on a
response from one of the kernel folk, who dislike TCP options even more
than network operators.

> Sorry for any confusion caused by an attempt to make a joke on DoH.  I
> didn't anticipate the sudden turn to serious discussion :-)  Which
> obviously was a good one.  I am all for BGP over TLS, so let's discuss
> https://laptop006.livejournal.com/60532.html

If anyone is at all interested in this I'm happy to discuss and flesh
out anything that's not clear. After I wrote this (over a few bottles of
red on the flight to linux.conf.au this year) I sent it to a bunch of
people that had expressed interest, including a few BGP implementations,
but nobody bit.


Re: syn flood attacks from NL-based netblocks

2019-08-20 Thread Jakob Heitz (jheitz) via NANOG
The source address in the SYN is spoofed. What if the real owner of the source 
address wanted to connect to you? Then your penaltybox would block him. An 
attacker could now use your penaltybox to cause a DoS to the real owner of the 
IP address.

> Date: Sun, 18 Aug 2019 08:48:08 -0700
> From: Mike 
> 
> My idea is to maintain a penaltybox for any client IP that initiated a
> connection but did not complete, while also maintaining a whitelist of
> 'frequent fliers' who have previously completed their connections
> successfully. The penalty could simply be to drop traffic sourced from
> those client ips that do not complete the handshake, for some
> configurable timeout period. The whitelisting feature could give a pass
> to good clients and allow these to bypass the penalty filtering, for a
> longer timeout period (but of course, passing it along so other ACL's
> can take effect). I'd say, perhaps, a 5 minute timeout would be
> sufficient for a penalty, while 1 day or longer would be sufficient for
> whitelisting. It would depend on your traffic of course, and definitely
> you would want something efficient such as linux ipset as opposed to
> individual iptables rules.
> 
> While looking around, I came across the SYNPROXY netfilter module.. it
> appears to be very complete but missing the above functionality to avoid
> responding to spoofed clients. I'm going to see about hacking up a proof
> of concept. I'll post here if I come up with something to play with.
> 
> Mike-


Re: Networks enforcing RPKI validation

2019-06-09 Thread Jakob Heitz (jheitz) via NANOG
Job,

Let me know if you have any issues doing this with IOS-XR.

Regards,
Jakob.

Date: Fri, 7 Jun 2019 17:29:49 +0200
From: Job Snijders 
To: Eric Dugas 
Cc: NANOG 
Subject: Re: Networks enforcing RPKI validation
Message-ID: <20190607152949.gc32...@hanna.meerval.net>
Content-Type: text/plain; charset=us-ascii

Point of clarification: NTT is not there yet, but we are on our way.
NTT does not yet apply RFC 6811 Origin Validation on its EBGP session
and does not yet reject RPKI Invalid BGP announcements.



Re: Analysing traffic in context of rejecting RPKI invalids

2019-03-14 Thread Jakob Heitz (jheitz) via NANOG
If at least one ROA matches a route, then the route is valid.
This is to cover the case when more than one AS is authorized to
originate a particular prefix.

https://tools.ietf.org/html/rfc6811
Page 5:
   o  NotFound: No VRP Covers the Route Prefix.

   o  Valid: At least one VRP Matches the Route Prefix.

   o  Invalid: At least one VRP Covers the Route Prefix, but no VRP
  Matches it.

BTW, this rule allows you to issue a ROA authorizing AS0 to originate
your complete address space.

Why would you do that? Suppose you own an address space, but you only
want to announce a portion of it to the internet. If you issue a ROA
for the unannounced portion authorizing your own ASN, then an attacker
can announce that portion and prepend your ASN. The attacker can thus
hijack your unannounced space and appear valid by RPKI!

To prevent that, you issue a ROA for your complete address space
authorizing AS0 and your BGP announced space authorizing your own ASN.

An AS path containing AS0 is malformed and is treated as a withdraw.
https://tools.ietf.org/html/rfc7607
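
The three states and the AS0 trick can be sketched in a few lines of Python.
This is a minimal illustration of the RFC 6811 definitions, not any router's
implementation; the VRP tuple, the helper names, and the prefixes and ASNs
(documentation ranges) are assumptions made for the example.

```python
import ipaddress
from typing import NamedTuple

class VRP(NamedTuple):
    """Validated ROA Payload: prefix, maximum length, authorized origin."""
    prefix: ipaddress.IPv4Network
    max_length: int
    asn: int

def vrp(prefix, max_length, asn):
    return VRP(ipaddress.ip_network(prefix), max_length, asn)

def validate(route_prefix, origin_asn, vrps):
    """Return 'NotFound', 'Valid' or 'Invalid' for an IPv4 route."""
    route = ipaddress.ip_network(route_prefix)
    covering = [v for v in vrps if route.subnet_of(v.prefix)]
    if not covering:
        return "NotFound"
    for v in covering:
        # A VRP for AS0 can never match, because no route originates from AS0.
        if v.asn != 0 and v.asn == origin_asn and route.prefixlen <= v.max_length:
            return "Valid"
    return "Invalid"

# Hypothetical owner of 203.0.113.0/24 (AS 64500) announces only the lower /25.
vrps = [
    vrp("203.0.113.0/24", 24, 0),       # whole space, authorizing AS0
    vrp("203.0.113.0/25", 25, 64500),   # announced space, own ASN
]
print(validate("203.0.113.0/25", 64500, vrps))    # Valid
print(validate("203.0.113.128/25", 64500, vrps))  # Invalid: forged origin fails
print(validate("198.51.100.0/24", 64500, vrps))   # NotFound

# Without AS0: a ROA for the whole space naming the owner's ASN lets a
# hijacker of the unannounced half appear Valid.
risky = [vrp("203.0.113.0/24", 25, 64500)]
print(validate("203.0.113.128/25", 64500, risky))  # Valid
```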

Regards,
Jakob.

-Original Message-
Date: Wed, 13 Mar 2019 11:17:22 -0400
From: Steve Meuse 
>
>
> Thanks for the update, but based on that description I'm not certain
> that you implemented the same thing that pmacct built, which IMO is
> what is needed by those considering deploying a drop-invalids policy.
> (Perhaps you omitted mentioning that ability in your description but
> included it in your implementation.)
>
>
Thanks Jay, you are correct. As we were talking through the logic we
realized we missed that bit. Internally, we're working through the logic to
understand if there is a covering route, is that route valid, and if not,
will we recurse and look for another covering route that is valid?

Either way, we'll be updating our software with that functionality shortly.

-Steve



RE: Cisco ASR's with RSP440 engines...

2019-02-20 Thread Jakob Heitz (jheitz) via NANOG
Wh! Thanks man!

Jakob.

-Original Message-

Date: Tue, 19 Feb 2019 15:26:38 +
From: Tom Hill 

On 18/02/2019 21:50, John Von Essen wrote:
> If anyone on here has experience with the ASR series running the
> RSP440-SE or -TR, please contact me off-list. I'm trying to better
> understand real world performance when it comes to handling a few full
> BGP tables on these, it would be running as very basic edge router
> primarily just doing BGP. I know the RSP440 is EOL, but the plan would
> be to upgrade to RSP880 within a year.

The 440 is a beast. Faster even than the 9001's RP. You'll be fine with
a LOT of BGP edge work. :)

The RSP880 is faster on paper, but I'll be impressed if you notice a
difference over the 440 in terms of solely basic BGP edge functions. It
of course has support for other things that you might need, however.

(No idea why this would need to be offlist...)

-- 
Tom



Re: BGP Experiment

2019-01-25 Thread Jakob Heitz (jheitz) via NANOG
It does, Ytti. And not just in testing. In feature development too.
Often in design discussions, someone pipes up: "someone does bla bla,
Let's not break it". One I remember from years ago was setting two
route reflectors as clients of each other and thinking route reflection
wasn't designed for that. It's being aware of such customer "creativity"
that keeps us on our toes.

Regards,
Jakob.

-Original Message-
From: Saku Ytti 

A lot of vendors, maybe all, accept your configurations and test them for
releases. I think this is the only viable solution vendors have for
black-box testing: gather configs from customers and test those, instead of
trying to guess what to test.
I've done that with Cisco in two companies, unfortunately I can't
really tell if it impacted quality, but I like to think it did.


-- 
  ++ytti


Re: Reaching out to ARIN members about their RPKI INVALID prefixes

2018-09-19 Thread Jakob Heitz (jheitz) via NANOG
Owen,

You are correct in that RPKI leaves many problems unsolved.

One that it does solve is prefix splitting.
If I issue a ROA for prefix 10.1.2.0/23 and leave maxlen at 23, any 
announcement of 10.1.2.0/24 (including mine) will be declared INVALID, because 
that announcement is covered by the ROA and the mask length is longer than maxlen.

Of course, as you rightly point out, if I do NOT announce that prefix myself, 
then anyone is free to announce it anywhere and have it declared VALID just by 
prepending my ASN.

Regards,
Jakob.

-Original Message-
Date: Tue, 18 Sep 2018 14:18:55 -0700
From: Owen DeLong 

What does RPKI offer other than a way to know what to spoof in a prepend for 
your forged announcement?


RE: Confirming source-routed multicast is dead on the public Internet

2018-08-02 Thread Jakob Heitz (jheitz) via NANOG
You could put this multicast receiver into the last hop before the customer
and then send unicast to the customer.

Regards,
Jakob.


-Original Message-
From: Saku Ytti  
Sent: Thursday, August 2, 2018 2:45 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Confirming source-routed multicast is dead on the public Internet

On Fri, 3 Aug 2018 at 00:42, Saku Ytti  wrote:

> Cute :). Well 8*bitrates, but nice optimisation to make stream count
> finite. Of course at cost of quality, as receiver needs up-speed of 8x
> at start. Interesting side-effect, quality increases as movie
> progresses :)

I may have worded up-speed potentially ambiguously, I mean over-speed,
meaning access needs higher than stream bitrate to receive stream of
specific bitrate. In practical world, of course already very
problematic scenario for most NFLX consumers.

-- 
  ++ytti


RE: Confirming source-routed multicast is dead on the public Internet

2018-08-02 Thread Jakob Heitz (jheitz) via NANOG
ok. Play 2 minutes of ads at the start and save a stream.
Play another 2 minutes of ads every 16 minutes, then the maximum number of 
streams is 4.
The ads can be received in a single stream or be received after the shorter 
streams have completed.

Regards,
Jakob.


-Original Message-
From: Saku Ytti  
Sent: Thursday, August 2, 2018 2:42 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: Confirming source-routed multicast is dead on the public Internet

Hey,

On Fri, 3 Aug 2018 at 00:36, Jakob Heitz (jheitz) via NANOG
 wrote:

> Hey, there's a better way.
> Split the movie into segments:
> Segment 1: Minute 1.
> Segment 2: Minute 2.
> Segment 3: Minutes 3,4.
> Segment 4: Minutes 5-8.
> Segment 5: Minutes 9-16.
> etc.
> Then send each segment in a loop.
> Each receiver receives every loop simultaneously.
> Each segment may start receiving part way through, but then it starts again.
> By the time a segment needs to play, it is completely received.
> A 128 minute movie needs 8 streams.
> While waiting for the first minute, you can play ads :)
> The shorter segments don't need to be sent for long:
> Receivers can stop receiving the short segments once they have received one 
> loop of it.
> When no receiver is receiving a loop, you can stop sending it.

Cute :). Well 8*bitrates, but nice optimisation to make stream count
finite. Of course at cost of quality, as receiver needs up-speed of 8x
at start. Interesting side-effect, quality increases as movie
progresses :)
-- 
  ++ytti


Re: Confirming source-routed multicast is dead on the public Internet

2018-08-02 Thread Jakob Heitz (jheitz) via NANOG
Hey, there's a better way.
Split the movie into segments:
Segment 1: Minute 1.
Segment 2: Minute 2.
Segment 3: Minutes 3,4.
Segment 4: Minutes 5-8.
Segment 5: Minutes 9-16.
etc.
Then send each segment in a loop.
Each receiver receives every loop simultaneously.
Each segment may start receiving part way through, but then it starts again.
By the time a segment needs to play, it is completely received.
A 128 minute movie needs 8 streams.
While waiting for the first minute, you can play ads :)
The shorter segments don't need to be sent for long:
Receivers can stop receiving the short segments once they have received one 
loop of it.
When no receiver is receiving a loop, you can stop sending it.
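
The arithmetic behind "a 128 minute movie needs 8 streams" can be checked
with a short Python sketch. It assumes each loop is sent at playback bitrate,
so a loop of an N-minute segment takes N minutes to receive in full; the
function names are made up for the illustration.

```python
def segment_lengths(movie_minutes):
    """Segment lengths in minutes: 1, 1, 2, 4, 8, ... until the movie is covered."""
    lengths = [1]
    covered = 1
    while covered < movie_minutes:
        # Each new segment is as long as everything before it (or the remainder).
        nxt = min(covered, movie_minutes - covered)
        lengths.append(nxt)
        covered += nxt
    return lengths

def ready_in_time(lengths, startup_wait=1):
    """Check that every segment is fully received before it has to play.

    A receiver joins all loops at t=0 and starts playback after startup_wait
    minutes. A segment of length L is complete after at most L minutes.
    """
    start = 0
    for length in lengths:
        play_at = startup_wait + start
        if length > play_at:
            return False
        start += length
    return True

lengths = segment_lengths(128)
print(lengths)                 # [1, 1, 2, 4, 8, 16, 32, 64]
print(len(lengths))            # 8
print(ready_in_time(lengths))  # True
```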

Regards,
Jakob.


-Original Message-
Date: Wed, 1 Aug 2018 19:24:21 +0300
From: Saku Ytti 

Imagine someone like youtube or netflix would like to use multicast,
instead of caches. They'd need to start new multicast stream for every
content with small delay (to get more viewers on given stream), how
much delay would consumer tolerate before content starts? 1min? 5min?
So every minute or every 5 minute new stream of movie would be sent,
except it would need to be sent many times, for each bitrate
supported.


RE: Segment Routing

2018-05-22 Thread Jakob Heitz (jheitz)
Nexus supports LDP.

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/5_x/nx-os/mpls/configuration/guide/mpls_cg/mp_ldp_overview.html


Regards,
Jakob


RE: is odd number of links in lag group ok

2018-05-16 Thread Jakob Heitz (jheitz)
Many routers do not rehash everything when a link breaks.
Doing so would disturb flows that were not broken, causing possible
misordered packets or jitter.
The flows on the broken link will get rehashed, of course.

Note that even if a hash function can distribute the flows evenly,
you may get some heavy flows, so you need to expect some imbalance.
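
A minimal Python sketch of that behaviour, assuming a fixed table of hash
buckets that map to member links (a common design, but the bucket count,
the CRC32 hash and the link names here are purely illustrative). When a link
fails, only the buckets that pointed at it are reassigned.

```python
import zlib

NUM_BUCKETS = 64  # hypothetical table size

def build_table(links):
    """Assign hash buckets to member links round-robin."""
    return [links[i % len(links)] for i in range(NUM_BUCKETS)]

def bucket_for(flow):
    """Map a flow tuple to a bucket; CRC32 stands in for the hardware hash."""
    return zlib.crc32(repr(flow).encode()) % NUM_BUCKETS

def fail_link(table, failed):
    """Reassign only the failed link's buckets; leave all others untouched."""
    survivors = sorted(set(table) - {failed})
    new_table = list(table)
    j = 0
    for i, link in enumerate(table):
        if link == failed:
            new_table[i] = survivors[j % len(survivors)]
            j += 1
    return new_table

table = build_table(["link0", "link1", "link2"])
after = fail_link(table, "link1")

moved = sum(1 for old, new in zip(table, after) if old != new)
print("buckets moved:", moved, "of", NUM_BUCKETS)

flow = ("192.0.2.10", "198.51.100.20", 6, 49152, 179)  # made-up 5-tuple
b = bucket_for(flow)
print("flow used", table[b], "and now uses", after[b])
```

A full rehash over the two surviving links would instead move many buckets
that never touched the failed link, which is the disturbance described above.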

Regards,
Jakob


-Original Message-
From: Ben Cannon 

It will work fine if you have a good modern router.   Consider this; all evenly 
grouped LAGs are odd in their failed conditions.

-Ben

> On May 15, 2018, at 8:28 AM, Mark Tinka  wrote:
> 
> 
> 
>> On 15/May/18 17:20, Jared Mauch wrote:
>> 
>> Much of this depends on the hardware, software and what hashing is used 
>> inbound
>> outbound traffic directions, etc.
>> 
>> It will likely work the way you expect, but one may be warmer than the other 
>> if
>> traffic ends up overloading a single bucket in the hash.
> 
> We haven't had a major issue when loading traffic over even or odd
> links. If you have decent hardware and software, it should all be fine,
> particularly if your traffic is all or mostly IP.
> 
> If you've got non-IP traffic in there, and your box cannot look into the
> payload to determine entropy, then things could get interesting. But
> this will happen even when you have even links... it's not anything
> specific to how many member links you have in the LAG, but rather, the
> router's need to maintain per-flow load balancing with limited
> information beyond Layer 2 data.
> 
> That said, an even number of links just leaves the warm & fuzzies turned
> on :-)...
> 
> Mark.


RE: Route Reflector Client Design Question

2018-05-04 Thread Jakob Heitz (jheitz)
You could optimize the packet hop count by making smaller
but more rings. For example, make one ring with
CORE1, CORE2, PE1, PE2, PE3.
And another ring with
CORE1, CORE2, PE4, PE5.

If you configure "route-reflector-client" on the CORE,
and mesh the clients, then you can additionally configure
"bgp client-to-client reflection disable".

However, if the CORE is just sending a default route,
then you probably have default-originate and no RR clients
on the CORE. Then you don't need to disable reflection,
because it's not reflecting anyway. (reflection refers to
the reflection of routes, not reflection of packets).

You could send limited important or heavily used prefixes between
the PEs using route policy without blowing up the TCAM.

Regards,
Jakob.

-Original Message-
From: Erik Sundberg 

I have a RR Client design question..


CORE1---2x10G---CORE2
 |                  |
 |                  |
 |     10G Ring     |
 |                  |
 |                  |
PE1--PE2--PE3--PE4--PE5


-Core1 & Core2 are RR Reflectors with full IPV4 Tables (ASR9K)
-MPLS LDP Enabled
-IGP is ISIS
-Each PE peers only with Core1 and Core2 as RR Clients with iBGP
-PE's are only receiving a default route from the Core Routers due to TCAM size 
of 20K (ASR920's\ME3800's)
-The ring does not have that much traffic on it <500m, so I do not want to use 
additional 10G ports on the Core's and is why I have it in a 10G U ring.
-Primary link to the cores is via the PE1 --- CORE1 link. For this 
discussion the link between PE5 and CORE2 is set up as a backup link.

The scenario is I have traffic between PE2 and PE3. Since the PE's are only 
receiving a default route from the Cores, traffic is label switched from PE2 to 
PE1 to Core1, which does an IP lookup at ingress and then label switches it 
back via PE1-PE2-PE3. This ends up being 5 hops and doubling the traffic on the 
link to the Cores.

My question is how do I get traffic to go directly between the PE's without 
going to the Core Routers?

1. Can I enable iBGP between the PE's in a full mesh to allow traffic between 
the PE's without going to the core's. Or does this break the Route Reflector 
model?
2. Create a route policy on the Core's advertising routes learned from the PE's 
back to all the PE's on the ring.
3. Is this one of the down sides to U Rings?
4. Leave it alone and move on to bigger and better things



Re: AS-Path - ORF Draft

2017-10-24 Thread Jakob Heitz (jheitz)
Even though the limit is applied before policy, the dropped prefixes don't 
count towards the limit. You can have a limit of 100 and receive 1000. If you 
drop 901 post policy, it will not kill the session, even when the limit is 
applied before policy.
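
One way to read the counting described above, as a Python sketch. This models
only the arithmetic in this message (prefixes dropped by policy are not
counted against the limit); it is not a description of any router's
internals, and the function and the integer "prefix ids" are invented for
the example.

```python
def session_survives(received, policy_accepts, limit):
    """Return False if the count of prefixes kept by policy exceeds the limit."""
    kept = 0
    for prefix in received:
        if not policy_accepts(prefix):
            continue  # dropped by policy: does not count towards the limit
        kept += 1
        if kept > limit:
            return False  # the session would be torn down here
    return True

received = range(1000)  # 1000 hypothetical prefixes, identified by number

# Policy drops 901 of them and keeps 99: the session stays up with limit 100.
print(session_survives(received, lambda p: p < 99, limit=100))   # True

# Policy keeps everything: the 101st kept prefix exceeds the limit.
print(session_survives(received, lambda p: True, limit=100))     # False
```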

Thanks,
Jakob.


> Date: Sun, 22 Oct 2017 17:37:52 -0500 (CDT)
> From: Mike Hammett 

> Their device goes through prefix limit before prefix filter, so their filters 
> wouldn't even see the advertisements as the prefix limit already killed the 
> session. Raise the prefix limit so that the filters can get to work and now 
> you're vulnerable to someone else injecting a ton of routes and melting their 
> router. 
> 


RE: AS-Path - ORF Draft

2017-10-23 Thread Jakob Heitz (jheitz)
IOS-XR does not have a pre-policy prefix limit.
When the limit is reached, the session will not automatically
re-establish. It needs to be manually cleared first.

It has the extra options:
warning-only        - does not drop the session.
discard-extra-paths - additionally, drops prefixes after the limit is reached.
restart             - automatically re-establish the session after the timeout.

I agree with Job that the use of warning-only can lead to unexpected routing,
because there is no control over which prefixes are dropped.
This is a big hammer that only comes down when the other hammers don't work.

Thanks,
Jakob.

--
Date: Mon, 23 Oct 2017 06:57:19 -0400
From: Greg Hankins 
To: Job Snijders 
Cc: nanog@nanog.org
Subject: Re: AS-Path - ORF Draft
Message-ID: <20171023105719.gh27...@nokia.com>
Content-Type: text/plain; charset=us-ascii

Nokia SR OS defaults to pre-policy but can be configured to post-policy
by adding "post-import".

prefix-limit ipv4 100 // pre-policy
prefix-limit ipv6 100 post-import // post-policy

Greg

-- 
Greg Hankins 

-Original Message-
Date: Mon, 23 Oct 2017 12:37:13 +0200
From: Job Snijders 
To: nanog@nanog.org
Subject: Re: AS-Path - ORF Draft

On Mon, Oct 23, 2017 at 08:35:42AM +0200, Job Snijders wrote:
> > or it could compare each additional prefix received to already learned
> > prefixes and decide to drop one to make room for the new one. For
> > example you could drop the most specific routes before less specific
> > routes.
> 
> The moment a BGP implementation can do such RIB compression, it may
> indeed make sense to offer two types of limits: a 'pre-policy maximum
> prefix limit' and a 'post-policy maximum prefix limit'. The former type
> of limit would be useful in context of route leaks, the latter in
> context of protecting against overflow of the FIB capability.

Apparently this already exists and is widely available, Saku Ytti gave
me some additional information. There are various keywords available,
and they operate at different attachment points in the conceptual model.

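                      |  IOS XR          |  Junos
 =====================+==================+========================
   pre-policy keyword |  ?               |  prefix-limit
 ---------------------+------------------+------------------------
  post-policy keyword |  maximum-prefix  |  accepted-prefix-limit

 (? means the keyword does not exist)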

Now I wonder what Arista EOS, Nokia SR-OS, etc offer in this regard. :-)

(screenshot here 
http://instituut.net/~job/screenshots/baf76f9c29a31d2e55454ddd.png
for those of you who can't easily view ASCII tables)

Kind regards,

Job



RE: AS PATH limits

2017-09-21 Thread Jakob Heitz (jheitz)
The consequence of keeping a route with a long AS_PATH is that it uses a little 
more memory.
Also, if you send it on, you will add one ASN and may exceed the maximum BGP 
message size and not be able to send it.
Even that is no reason to drop the incoming route.
The consequence of dropping the route is that someone loses connectivity 
because you dropped it.

The need for limiting AS_PATH length stemmed from this incident:
https://dyn.com/blog/the-flap-heard-around-the-world/

This bug has long been fixed, so it should not happen again.
However, if you want to be extra cautious, because unpatched routers may still 
be out there, then a limit of 200 should not drop any normal route. Just keep 
an eye on what you are dropping.

Thanks,
Jakob


Date: Tue, 19 Sep 2017 13:33:03 +
From: craig washington 
To: "nanog@nanog.org" 
Subject: AS PATH limits
Message-ID:



Content-Type: text/plain; charset="iso-8859-1"

Hello world.

I was wondering and forgive me if this discussion has already taken place.

How many AS PATHS are too many?

Meaning how do we determine how many to filter on transit links or public 
peering links?


Thanks in advance



Re: Long AS Path

2017-06-27 Thread Jakob Heitz (jheitz)
The reason that a private ASN in the public routing table is an error is that 
the AS Path is used to prevent loops. You may have private AS 65000 in your 
organization and I may have another private AS 65000 in my organization. If my 
ASN 65000 is in the AS path of a route sent to you, then your AS 65000 will 
drop it, thinking it were looping back.

BTW, this is different from a confederation member AS.
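
The loop check that causes the collision can be sketched in Python as
follows. The private ranges are the ones reserved by RFC 6996; the function
names and the example path are made up, and real implementations add knobs
(such as allowing the local AS in the path) that this sketch ignores.

```python
def is_private_asn(asn):
    """True for ASNs reserved for private use (RFC 6996)."""
    return 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294

def accept_route(local_asn, as_path):
    """Basic eBGP loop prevention: reject a path that already contains our ASN."""
    return local_asn not in as_path

# A route that leaked through my private AS 65000 on its way to the Internet.
leaked_path = [64496, 65000, 64511]

print([asn for asn in leaked_path if is_private_asn(asn)])  # [65000]
print(accept_route(65000, leaked_path))  # False: your AS 65000 sees a "loop"
print(accept_route(64499, leaked_path))  # True: other networks accept it
```

Your AS 65000 is unrelated to mine, yet the route is dropped exactly as if it
had looped back.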

Thanks,
Jakob.


> Date: Mon, 26 Jun 2017 16:27:39 +
> From: Mel Beckman 
> To: Michael Hare 
> Cc: Hunter Fuller , James Bensley
>,  "nanog@nanog.org" 
> Subject: Re: Long AS Path
> Message-ID: <5cc4ba8e-8fbf-4ad4-835d-2c06265ce...@beckman.org>
> Content-Type: text/plain; charset="us-ascii"
> 
> Michael,
> 
> Filtering private ASNs is actually part of the standard. It's intrinsic in 
> the term "private ASN". A private ASN in the public routing table is a clear 
> error, so filtering them is reasonable. Long AS paths are not a clear error.
> 
> I'm surprised nobody here who complains about long paths has followed my 
> suggestion: call the ASN operator and ask them why they do it, and report the 
> results here. 
> 
> Until somebody does that, I don't see long path filtering as morally 
> defensible :)
> 
> -mel beckman
> 
>> On Jun 26, 2017, at 8:09 AM, Michael Hare  wrote:
>> 
>> Couldn't one make the same argument with respect to filtering private ASNs 
>> from the global table?  Unlike filtering of RFC1918 and the like a private 
>> ASN in the path isn't likely to leak RFC1918 like traffic, yet I believe 
>> several major ISPs have done just that.  This topic was discussed ~1 year 
>> ago on NANOG.
>> 
>> I do filter private ASNs but have not yet filtered long AS paths.  Before I 
>> did it I had to contact a major CDN because I would have dropped their 
>> route, in the end costing me money (choosing transit vs peering).


Re: Long AS Path

2017-06-22 Thread Jakob Heitz (jheitz)
23456 is AS_TRANS. Either your router does not support 4 byte AS or there is a 
bug at AS 12956 or AS 12956 is intentionally prepending 23456.

Thanks,
Jakob.


> 
> Date: Tue, 20 Jun 2017 23:12:45 +
> From: James Braunegg 
> To: "nanog@nanog.org" 
> Subject: Long AS Path
> Message-ID: 
> Content-Type: text/plain; charset="us-ascii"
> 
> Dear All
> 
> Just wondering if anyone else saw this yesterday afternoon ?
> 
> Jun 20 16:57:29:E:BGP: From Peer 38.X.X.X received Long AS_PATH= AS_SEQ(2) 
> 174 12956 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 
> 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 
> 23456 23456 23456 23456 ... attribute length (567) More than configured 
> MAXAS-LIMIT
> 
> Jun 20 16:15:26:E:BGP: From Peer 78.X.X.X received Long AS_PATH= AS_SEQ(2) 
> 5580 3257 12956 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 
> 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 23456 
> 23456 23456 23456 23456 ... attribute length (568) More than configured 
> MAXAS-LIMIT
> 
> Someone is having fun, creating weird and wonderful long AS paths based 
> around AS 23456, we saw the same pattern of data from numerous upstream 
> providers.
> 
> Kindest Regards,
> 
> James Braunegg
> 
> 


RE: WEBINAR TUESDAY: Can We Make IPv4 Great Again?

2017-03-07 Thread Jakob Heitz (jheitz)
1.1.1.1.e.f and 2.2.2.2.e.f both get translated to 192.168.e.f.

Some higher layer protocols embed IP addresses into their data.

These points make changing IP so difficult.

In addition, IPv6 has link local addresses.
This one seemingly insignificant detail causes so much code churn
and is probably responsible for 10 years of the IPv6 drag.
Thankfully, site local was deprecated.

Thanks,
Jakob.

> Date: Mon, 6 Mar 2017 22:00:45 +0100
> From: Baldur Norddahl 
> To: nanog@nanog.org
> Subject: Re: WEBINAR TUESDAY: Can We Make IPv4 Great Again?
> Message-ID: <5645714e-e468-4655-34cf-6e70aa7cf...@gmail.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
> 
> That proposal is far too wordy. Here is the executive summary:
> 
> Encode extra address bits in extension headers. Add a network element
> near the destination that converts such that the destination IP of a
> packet to IP a.b.c.d with extension header containing e.f is translated
> to 192.168.e.f. In the reverse direction translate source address
> 192.168.e.f to a.b.c.d and add option header with e.f.
> 
> Executive summary end.
> 
> As far as I can tell, the only advantage of this proposal over IPv6 is
> that the network core does not need to be changed. You could communicate
> with someone that had an EZIP address regardless that your ISP did
> nothing to support EZIP.
> 
> The disadvantage is that every single server out there would need to be
> changed so it does not just drop the option headers on the reply
> packets. All firewalls updated so they do not block packets with option
> headers. All applications updated so they understand a new address format.
> 
> Servers and applications could also confuse TCP or UDP streams that are
> apparently from the same source, same port numbers, only thing that
> differentiates the streams is some option header that the server does
> not understand.
> 
> The customers of the ISP that deploys EZIP would not need to update
> anything (unless they need to communicate with other poor souls that got
> assigned EZIP addresses), however everyone else would. This is not a
> good balance. The customers would experience an internet where almost
> nothing works. It would be magnitudes worse than the experience of an
> IPv6 only network with NAT64.
> 
> It is a fix for the wrong problem. Major ISPs have IPv6 support now. It
> is the sites (=servers) that are lacking. If Twitter did not deploy IPv6
> why would you expect them to deploy EZIP? Why would some old forgotten
> site with old song texts in some backwater country somewhere?
> 
> We already have better solutions such as CGN with dual stack, NAT64,
> DS-Lite, MAP etc.
> 
> None of that is discussed in the RFC. Is the author aware of it?
> 
> Regards,
> 
> Baldur
> 


RE: Soliciting your opinions on Internet routing: A survey on BGP convergence

2017-01-11 Thread Jakob Heitz (jheitz)
When you simply bring down an eBGP session, withdrawals propagate throughout
the network.
Soon after, the alternate routes propagate. In the interim, some routers
lose connectivity.
Graceful shutdown solves this problem, but it only works for a planned
shutdown.
The interim can last many minutes because of the advertisement interval
(MRAI timer).
One way to reduce the interim to seconds instead of minutes is to set the
MRAI timer to 0 on all routers. The potential problem is that any BGP
instability in the network will then cause serious flapping.
Another alternative is to use BGP add-path (RFC 7911) to distribute backup
routes.
This avoids the MRAI problem, but requires more memory on routers.
It also works for an accidental shutdown.
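
A back-of-the-envelope sketch of why the interim can stretch to minutes,
assuming each round of path exploration waits out one full MRAI interval.
The 30-second value is the eBGP default suggested in RFC 4271; the round
counts and the function name are made-up assumptions for illustration, not
measurements.

```python
# Rough estimate only, not a model of any real network.
# Assumption: every round of path exploration is delayed by one MRAI interval.

EBGP_MRAI_DEFAULT_S = 30  # suggested eBGP default in RFC 4271


def interim_seconds(mrai_s: float, rounds: int) -> float:
    """Upper bound on the interim when each round waits a full MRAI."""
    return mrai_s * rounds


if __name__ == "__main__":
    for rounds in (2, 5, 10):  # hypothetical round counts
        total = interim_seconds(EBGP_MRAI_DEFAULT_S, rounds)
        print(f"{rounds} rounds x {EBGP_MRAI_DEFAULT_S}s = {total / 60:.1f} min")
```

With MRAI set to 0 the same estimate collapses to zero, which is the
trade-off described above: fast convergence, but nothing left to damp churn.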

Thanks,
Jakob.


> -----Original Message-----
> From: Jakob Heitz (jheitz)
> Sent: Tuesday, January 10, 2017 11:52 AM
> To: nanog@nanog.org; 'baldur.nordd...@gmail.com' <baldur.nordd...@gmail.com>
> Subject: RE: Soliciting your opinions on Internet routing: A survey on BGP 
> convergence
> 
> Hi Baldur,
> 
> Have you tried graceful shutdown?
> You need redundant links, but not to the same transit.
> https://tools.ietf.org/html/draft-ietf-grow-bgp-gshut-06
> This draft is expired, but it is actually implemented by several vendors.
> 
> I implemented this.
> http://www.slideshare.net/bduvivie/bgp-graceful-shutdown-ios-xr
> I added an option to configure AS-path prepends in case the gshut community 
> was not supported by peers.
> 
> Thanks,
> Jakob.
> 
> 
> > Date: Tue, 10 Jan 2017 03:51:04 +0100
> > From: Baldur Norddahl <baldur.nordd...@gmail.com>
> >
> > Hello
> >
> > I find that the type of outage that affects our network the most is
> > neither of the two options you describe. As is probably typical for
> > smaller networks, we do not have redundant uplinks to all of our
> > transits. If a transit link goes, for example because we had to reboot a
> > router, traffic is supposed to reroute to the remaining transit links.
> > Internally our network handles this fairly fast for egress traffic.
> >
> > However the problem is the ingress traffic - it can be 5 to 15 minutes
> > before everything has settled down. This is the time before everyone
> > else on the internet has processed that they will have to switch to your
> > alternate transit.
> >
> > The only solution I know of is to have redundant links to all transits.
> > Going forward I will make sure we have this because it is a huge
> > disadvantage not being able to take a router out of service without
> > causing downtime for all users. Not to mention that a router crash or
> > link failure that should have taken seconds at most to reroute, but
> > instead causes at least 5 minutes of unstable internet.
> >
> > Regards,
> >
> > Baldur


RE: Soliciting your opinions on Internet routing: A survey on BGP convergence

2017-01-10 Thread Jakob Heitz (jheitz)
Hi Baldur,

Have you tried graceful shutdown?
You need redundant links, but not to the same transit.
https://tools.ietf.org/html/draft-ietf-grow-bgp-gshut-06
This draft is expired, but it is actually implemented by several vendors.

I implemented this.
http://www.slideshare.net/bduvivie/bgp-graceful-shutdown-ios-xr
I added an option to configure AS-path prepends in case the gshut community was 
not supported by peers.
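
A minimal sketch of the idea, not the IOS-XR implementation. It assumes the
GRACEFUL_SHUTDOWN community value 65535:0 that was later standardized in
RFC 8326 (the expired draft above predates it), and that a supporting
receiver drops LOCAL_PREF to 0. The Route class and both function names are
hypothetical.

```python
from dataclasses import dataclass, field

GRACEFUL_SHUTDOWN = (65535, 0)  # well-known community value from RFC 8326


@dataclass
class Route:
    prefix: str
    local_pref: int = 100
    as_path: list = field(default_factory=list)
    communities: set = field(default_factory=set)


def outbound_gshut(route: Route, local_as: int, prepends: int = 0) -> Route:
    """Sender side: tag the route before the planned shutdown.

    Prepending is the fallback for peers that ignore the community.
    """
    route.communities.add(GRACEFUL_SHUTDOWN)
    route.as_path = [local_as] * prepends + route.as_path
    return route


def inbound_policy(route: Route) -> Route:
    """Receiver side: make tagged routes the least preferred choice."""
    if GRACEFUL_SHUTDOWN in route.communities:
        route.local_pref = 0
    return route
```

The point of both knobs is the same: neighbors move to the alternate path
while the old one still forwards, so the session can then be dropped without
a gap.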

Thanks,
Jakob.


> Date: Tue, 10 Jan 2017 03:51:04 +0100
> From: Baldur Norddahl 
> 
> Hello
> 
> I find that the type of outage that affects our network the most is
> neither of the two options you describe. As is probably typical for
> smaller networks, we do not have redundant uplinks to all of our
> transits. If a transit link goes, for example because we had to reboot a
> router, traffic is supposed to reroute to the remaining transit links.
> Internally our network handles this fairly fast for egress traffic.
> 
> However the problem is the ingress traffic - it can be 5 to 15 minutes
> before everything has settled down. This is the time before everyone
> else on the internet has processed that they will have to switch to your
> alternate transit.
> 
> The only solution I know of is to have redundant links to all transits.
> Going forward I will make sure we have this because it is a huge
> disadvantage not being able to take a router out of service without
> causing downtime for all users. Not to mention that a router crash or
> link failure that should have taken seconds at most to reroute, but
> instead causes at least 5 minutes of unstable internet.
> 
> Regards,
> 
> Baldur


Re: RPKI implementation

2016-06-16 Thread Jakob Heitz (jheitz)
That is also configurable.

Thanks,
Jakob.


On Jun 16, 2016, at 4:39 AM, Randy Bush  wrote:

>> When a cache loses connectivity, the entries from that cache
>> are purged after a time interval. Default is 60 seconds
> 
> why not the poll interval for that cache server?
> 
> randy


RPKI implementation

2016-06-16 Thread Jakob Heitz (jheitz)
During the RPKI presentation there was a question about
resilience of the router if the RPKI cache loses connectivity.
The IOS-XR implementation allows multiple caches to be configured.
When a cache loses connectivity, the entries from that cache
are purged after a time interval. Default is 60 seconds and it is configurable.
A lookup of a prefix that is not loaded will return not-found.
5 seconds after the latest RPKI database update,
a refresh request is sent to each neighbor, provided that the neighbor either:
- dropped any received route due to a policy that contains validation-state, or
- received a route, the validation state of which changed.
If soft reconfiguration inbound is configured, then the refresh is avoided,
because the received paths are stored.
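
A small sketch of the purge behaviour described above, under stated
assumptions: time is passed in explicitly, entries are plain tuples, and the
class and method names are hypothetical. It is not the IOS-XR code.

```python
PURGE_AFTER_S = 60.0  # default from the description above; configurable


class RpkiTable:
    """Keeps ROA entries per cache and purges a cache after it goes silent."""

    def __init__(self, purge_after_s: float = PURGE_AFTER_S):
        self.purge_after_s = purge_after_s
        self.entries = {}  # cache name -> set of (prefix, max_len, asn)
        self.lost_at = {}  # cache name -> time connectivity was lost

    def load(self, cache: str, roas) -> None:
        """Store the entries from a cache and mark it as connected."""
        self.entries[cache] = set(roas)
        self.lost_at.pop(cache, None)

    def connection_lost(self, cache: str, now: float) -> None:
        """Start the purge timer; keep the first loss time if already lost."""
        self.lost_at.setdefault(cache, now)

    def tick(self, now: float) -> None:
        """Purge entries of every cache that has been lost for long enough."""
        for cache, lost in list(self.lost_at.items()):
            if now - lost >= self.purge_after_s:
                self.entries.pop(cache, None)
                del self.lost_at[cache]

    def all_roas(self) -> set:
        """Union of the entries from all caches that are still loaded."""
        return set().union(*self.entries.values())
```

With two caches configured, losing one leaves the other's entries in place,
so lookups only start returning not-found for prefixes that no remaining
cache carries.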

Thanks,
Jakob.


Re: RPKI and offline routes

2016-06-14 Thread Jakob Heitz (jheitz)
ASN 0 is used for this purpose.
Look for the word "zero" in
https://tools.ietf.org/html/rfc6907
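
To illustrate why this helps, here is a rough sketch of origin validation in
the style of RFC 6811, assuming ROAs are given as (prefix, max_length, asn)
tuples. The function name and the sample data are hypothetical. A ROA for
AS 0 covers the prefix but can never match an origin, so any announcement of
that prefix comes out invalid unless another ROA authorizes it.

```python
import ipaddress


def validate(prefix: str, origin_as: int, roas) -> str:
    """Return 'valid', 'invalid' or 'not-found' for a route."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_as in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # subnet_of() raises on mixed IPv4/IPv6, so check the version first.
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        # AS 0 never matches any origin.
        if roa_as != 0 and roa_as == origin_as and net.prefixlen <= max_len:
            return "valid"
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    as0_only = [("192.0.2.0/24", 24, 0)]
    print(validate("192.0.2.0/24", 64500, as0_only))     # covered, no match
    print(validate("198.51.100.0/24", 64500, as0_only))  # no covering ROA
```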

Thanks,
Jakob.

> Date: Mon, 13 Jun 2016 17:53:45 -0500 (Central Sommerzeit)
> From: Matthias Waehlisch 
> To: Theodore Baschak 
> Cc: NANOG Operators' Group 
> Subject: Re: RPKI and offline routes
> 
> Hi,
> 
>   the creation of a ROA does not require the announcement of the prefix.
> Creation of a ROA, prefix announcement, and validation of the prefix are
> decoupled. If you are the legitimate resource holder you can create a
> ROA for this prefix (even if you don't advertise the prefix). As soon as
> the prefix is advertised, third parties can validate based on the
> created ROA.
> 
>   However, in case the hijacker is able to use the legitimate origin
> ASN, the validation outcome would be valid. You would need to assign the
> prefix to an ASN that cannot be hijacked or is dropped for other
> reasons. (Or do BGPsec. ;)
> 
> 
> Cheers
>   matthias
> 
> On Mon, 13 Jun 2016, Theodore Baschak wrote:
> 
> > Can RPKI be used with routes that are not being advertised at the moment?
> > As in to sign a route that *could* be there, but is not there presently.
> >
> > There's been several BGP hijacks that I've followed closely that
> > involved hijacking IP space as well as the ASN that would normally
> > originate it. I'm wondering if having valid ROAs/RPKI would have
> > helped in this case or not.
> >
> >
> > Theodore Baschak - AS395089 - Hextet Systems
> >



RE: Superfluous advertisement (was: Friday's Random Comment)

2016-04-30 Thread Jakob Heitz (jheitz)
Simpler, with B and C peered:

   F
  / \
 B---C
  \ /
   A

If B does not send the /24 to F,
then F will send all the traffic to C,
even if A wanted a load balance.

Maybe I could ask the community:
Why do you advertise longer prefixes with the
same nexthop as the shorter prefix?
Is it this use case, or something else?

Thanks,
Jakob.

> -----Original Message-----
> From: Russ White [mailto:7ri...@gmail.com]
> Sent: Saturday, April 30, 2016 12:35 PM
> To: Jakob Heitz (jheitz) <jhe...@cisco.com>; nanog@nanog.org
> Subject: RE: Superfluous advertisement (was: Friday's Random Comment)
> 
> 
> > A use case for a longer prefix with the same nexthop:
> >
> >    F
> >   / \
> >  D   E
> >  |   |
> >  B   C
> >   \ /
> >    A
> >
> > Suppose A is a customer of B and C.
> 
> This is possible, but only remotely probable. In the real world, D and E are
> likely peered, as are B and C. Further, it's quite possible for F to choose
> the path through E anyway, regardless of A's wishes, or even to load share
> over to the two paths. If it's really a backup path, and you don't want
> traffic on it unless the primary is completely down, then you need to not
> advertise it until you actually need it. One of the various principles of
> packet based routing is that if you advertise reachability, it means
> someone, someplace, might just choose the path you've advertised. You can't
> control what other people choose.
> 
> :-)
> 
> Russ



Superfluous advertisement (was: Friday's Random Comment)

2016-04-30 Thread Jakob Heitz (jheitz)
A use case for a longer prefix with the same nexthop:

   F
  / \
 D   E
 |   |
 B   C
  \ /
   A

Suppose A is a customer of B and C.
B has a large address space: 10.1.0.0/16.
B allocates a subset to A: 10.1.1.0/24.
A advertises the longer prefix to its backup provider C.
C propagates it to E and then to F.
B MUST advertise both 10.1.0.0/16 and 10.1.1.0/24 to D.
D MUST propagate both of them to F.
Otherwise, if F only receives 10.1.0.0/16 from D, then
F will have the longer match 10.1.1.0/24 to E,
but E is only the backup route.
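
A short sketch of the longest-match effect, using the prefixes from the
example. The two tables are hypothetical views of F's forwarding state, and
the second one assumes F's policy prefers D for the /24 when it hears it from
both sides.

```python
import ipaddress


def lookup(fib: dict, dst: str):
    """Longest-prefix match: next hop of the most specific covering route."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in fib.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# F's table if D only sends the /16: the /24 is known only via E.
fib_aggregated = {"10.1.0.0/16": "D", "10.1.1.0/24": "E"}

# F's table if D sends both (assumption: F prefers D for the /24).
fib_both = {"10.1.0.0/16": "D", "10.1.1.0/24": "D"}

if __name__ == "__main__":
    print(lookup(fib_aggregated, "10.1.1.5"))  # traffic for A goes via E
    print(lookup(fib_both, "10.1.1.5"))        # traffic for A goes via D
```

In the first table the /16 via D never gets a chance for A's addresses,
whatever its attributes, because the more specific route wins first.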

Thanks,
Jakob.

> -----Original Message-----
> Date: Fri, 29 Apr 2016 08:17:41 -0400
> From: Alain Hebert 
> To: "'NANOG list'" 
> Subject: Friday's Random Comment - About: Arista and FIB/RIB's
> Message-ID: <00ea292f-e779-25ad-ce89-eae897e95...@pubnix.net>
> Content-Type: text/plain; charset=utf-8
> 
> While following that Arista chat...  That reminded me of that little
> afternoon project years ago.
> 
> So I decided to find new hamsters, fire up that VM, refresh the DB's and
> from the view point of a tiny 7206VXR/G1 with 2 T3 peers...
> 
> The amount of superfluous subnet advertisement drop to ~120k from
> ~166k from the previous snapshot.
> 
> And this is the distribution by country.
> 
>       country       | superfluous
> --------------------+-------------
>  United States      |       28254
>  Brazil             |       10012
>  China              |        7537
>  India              |        6449
>  Russian Federation |        4524
>  Korea, Republic of |        4062
>  Saudi Arabia       |        3297
>  Australia          |        2989
>  Indonesia          |        2878
>  Hong Kong          |        2251
>  Thailand           |        2093
>  Canada             |        2019
>  Taiwan             |        1955
>  Ukraine            |        1877
>  Singapore          |        1856
>  Bulgaria           |        1488
>  Argentina          |        1436
>  Japan              |        1403
>  Mexico             |        1351
>  Chile              |        1271
> 
> (Damn Canada, can't break top 10 again).
> 
> PS: "Superfluous" is a nice way to say that the best path of a
> subnet is the same as his supernet.  And yes I'm aware of the Weekly
> Routing Report, I was just curious to see it by country =D.
> 
> -
> Alain Hebert                                aheb...@pubnix.net
> PubNIX Inc.
> 50 boul. St-Charles
> P.O. Box 26770 Beaconsfield, Quebec H9W 6G7
> Tel: 514-990-5911  http://www.pubnix.net    Fax: 514-990-9443



Re: Internet Exchanges supporting jumbo frames?

2016-03-18 Thread Jakob Heitz (jheitz)
What's driving the desire for larger packets?

A single bit error will drop a whole packet.
Larger packets will cause more loss. Cables will need to be
shorter or bitrates lower to compensate.
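
A quick sketch of that relationship, assuming independent bit errors and no
error correction, so a packet is lost whenever any one of its bits is hit.
The function name and the bit error rate used in the example are
hypothetical.

```python
import math


def packet_loss_probability(ber: float, packet_bytes: int) -> float:
    """P(at least one bit error) = 1 - (1 - ber) ** bits.

    Written with log1p/expm1 so tiny error rates do not round to zero.
    """
    bits = packet_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))


if __name__ == "__main__":
    ber = 1e-12  # hypothetical bit error rate
    for size in (1500, 9000):
        print(f"{size} bytes: {packet_loss_probability(ber, size):.3e}")
```

For small error rates the loss probability grows almost linearly with packet
size, so a 9000-byte packet is roughly six times as likely to be dropped as a
1500-byte one on the same link.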

Byte overhead of packet headers?

Are we seeing degradation of packets per second in forwarding
due to the increase in IPv6? Are we seeing IPv6 packets with
hop-by-hop extension headers (which forward on the slow path)?

Increasing the packet size will reduce the number of TCP
packets as well as the number of TCP ack packets.
TCP ACK packets are significantly larger in IPv6 than IPv4.

TCP slow start is faster with large MTU. Is this an issue?

Are IPv6 packets with extension headers causing performance
degradation in firewalls?

Thanks,
Jakob.



Re: Internet Exchanges supporting jumbo frames?

2016-03-18 Thread Jakob Heitz (jheitz)
You would hardly notice it.
Helium is 4 times as heavy as hydrogen, but only marginally less buoyant.

Header overhead:
Ethernet=38
IPv4=20
TCP=20
Total=78
Protocol efficiency:
1500: 1500/1578 = 95%
9000: 9000/9078 = 99%

That's 4% better for a TCP packet, not 600%.
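
The same arithmetic as a small sketch. The breakdown of the 38 bytes in the
comment is my reading (preamble, header, FCS and interframe gap), and the
function name is hypothetical. The optional flag shows the variant where the
1500 or 9000 is taken as the MTU, so the IPv4 and TCP headers sit inside it;
the conclusion does not change.

```python
ETHERNET_OVERHEAD = 38  # assumed: preamble+SFD 8, header 14, FCS 4, gap 12
IPV4_HEADER = 20
TCP_HEADER = 20


def efficiency(size: int, headers_inside_size: bool = False) -> float:
    """Fraction of bytes on the wire that carry payload."""
    if headers_inside_size:
        # size is the MTU: IPv4 and TCP headers are part of it.
        payload = size - IPV4_HEADER - TCP_HEADER
        on_wire = size + ETHERNET_OVERHEAD
    else:
        # size is the payload, as in the figures above.
        payload = size
        on_wire = size + ETHERNET_OVERHEAD + IPV4_HEADER + TCP_HEADER
    return payload / on_wire


if __name__ == "__main__":
    for size in (1500, 9000):
        print(f"{size}: {efficiency(size):.1%} / {efficiency(size, True):.1%}")
```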

Thanks,
Jakob.


> On Mar 18, 2016, at 6:45 PM, Tim McKee <t...@baseworx.net> wrote:
> 
> I would hazard a guess that reducing the packet header overhead *and* the 
> Ethernet interframe gap time by a factor of 6 could make enough of an 
> improvement to be quite noticeable when dealing with huge dataset transfers.
> 
> Tim McKee
> 
> -----Original Message-----
> From: NANOG [mailto:nanog-boun...@nanog.org] On Behalf Of Jakob Heitz (jheitz)
> Sent: Friday, March 18, 2016 18:21
> To: Dale W. Carder
> Cc: nanog@nanog.org
> Subject: RE: Internet Exchanges supporting jumbo frames?
> 
> Then it's mainly TCP slowstart that you're trying to improve?
> 
> Thanks,
> Jakob.
> 
>> -----Original Message-----
>> From: Dale W. Carder [mailto:dwcar...@wisc.edu]
>> Sent: Friday, March 18, 2016 3:03 PM
>> To: Jakob Heitz (jheitz) <jhe...@cisco.com>
>> Cc: nanog@nanog.org
>> Subject: Re: Internet Exchanges supporting jumbo frames?
>> 
>> Thus spake Jakob Heitz (jheitz) (jhe...@cisco.com) on Fri, Mar 18, 2016 at 
>> 09:29:44PM +:
>>> What's driving the desire for larger packets?
>> 
>> In our little corner of the internet, it is to increase the 
>> performance of a low number of high-bdp flows which are typically dataset 
>> transfers.
>> All of our non-commercial peers support 9k.
>> 
>> Dale
> 


RE: Internet Exchanges supporting jumbo frames?

2016-03-18 Thread Jakob Heitz (jheitz)
Then it's mainly TCP slowstart that you're trying to improve?

Thanks,
Jakob.

> -----Original Message-----
> From: Dale W. Carder [mailto:dwcar...@wisc.edu]
> Sent: Friday, March 18, 2016 3:03 PM
> To: Jakob Heitz (jheitz) <jhe...@cisco.com>
> Cc: nanog@nanog.org
> Subject: Re: Internet Exchanges supporting jumbo frames?
> 
> Thus spake Jakob Heitz (jheitz) (jhe...@cisco.com) on Fri, Mar 18, 2016 at 
> 09:29:44PM +:
> > What's driving the desire for larger packets?
> 
> In our little corner of the internet, it is to increase the performance
> of a low number of high-bdp flows which are typically dataset transfers.
> All of our non-commercial peers support 9k.
> 
> Dale


RE: Environmental Graph Interpretation

2015-11-11 Thread Jakob Heitz (jheitz)
If the temperature of the floor is below the dew point, then it will sweat.
Maybe there's a cold wind blowing underneath the gap?
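
To check this against the graphs, a rough sketch using the Magnus
approximation for dew point. The coefficients are a commonly used set for
water vapour over liquid water; the function names and the sample readings
are hypothetical, not taken from the graphs in question.

```python
import math

# Magnus coefficients (over liquid water), temperatures in degrees Celsius.
A = 17.62
B = 243.12


def dew_point_c(air_temp_c: float, rel_humidity_pct: float) -> float:
    """Approximate dew point from air temperature and relative humidity."""
    gamma = math.log(rel_humidity_pct / 100.0) + A * air_temp_c / (B + air_temp_c)
    return B * gamma / (A - gamma)


def will_condense(surface_c: float, air_temp_c: float, rel_humidity_pct: float) -> bool:
    """A surface sweats when it is colder than the dew point of the air."""
    return surface_c < dew_point_c(air_temp_c, rel_humidity_pct)


if __name__ == "__main__":
    # Hypothetical readings: 25 C room air at 60% RH, floor at 10 C.
    print(f"dew point: {dew_point_c(25.0, 60.0):.1f} C")
    print("floor sweats:", will_condense(10.0, 25.0, 60.0))
```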

--Jakob


> -----Original Message-----
> Date: Tue, 10 Nov 2015 17:25:04 -0600
> From: "Lorell Hathcock" 
> 
> It is on the ground floor, but it is in a hut that has a wood floor that is
> raised off the ground.  There is a gap between the bottom of the floor and
> the ground.
> 
> -----Original Message-----
> From: valdis.kletni...@vt.edu [mailto:valdis.kletni...@vt.edu]
> Sent: Tuesday, November 10, 2015 5:13 PM
> To: Lorell Hathcock 
> Cc: 'NANOG list' 
> Subject: Re: Environmental Graph Interpretation
> 
> On Tue, 10 Nov 2015 16:48:04 -0600, "Lorell Hathcock" said:
> > Is there anyone on the list who would care to take a look at some
> > graphs of temperature, relative humidity and dew point that I have for
> > two locations?
> > In one of the two locations, I'm having a problem with the floor
> > getting wet (condensation?).  At the other everything is just fine.
> 
> Is your moisture problem on a ground floor?  Note that even well-cured
> concrete is like 30% water, and can allow moisture to slowly migrate through
> and weep.  Usual cure is application of a proper sealant over the concrete.


RE: BGP advertise-best-external on RR

2015-09-05 Thread Jakob Heitz (jheitz)
If your network is such that only a handful of routers supply redundant paths,
then you can set up iBGP sessions with those directly without going via route
reflectors. You can have most routes going through reflectors and a few through
direct BGP sessions. Not everything needs to go through route reflectors.

You can even do both: have a router peer with a reflector as well as directly
if you only need the redundant routes in a few places. You will end up with
duplicate routes, but that's not a show stopper. You can avoid duplicate routes
with route maps.

You can have multiple route reflectors with different cluster IDs that carry
the redundant routes only. Clients can peer with multiple clusters. Use route
maps to avoid duplicate routes. These last things get complicated to manage,
so I'd still go for add-path if at all possible.

--Jakob


> Message: 2
> Date: Tue, 1 Sep 2015 14:51:27 +0200
> From: Mohamed Kamal 
> To: Jeff Tantsura , Diptanshu Singh
>   
> Cc: NANOG 
> Subject: Re: BGP advertise-best-external on RR
> Message-ID: <55e59f4f.8010...@noor.net>
> Content-Type: text/plain; charset=windows-1252; format=flowed
> 
> Hi,
> 
> Diverse-path will only send the second best path, and in my case I have
> three routes not two. In addition to that, every PE will have to peer
> with the RR via a second session (on the same RR, as I will not deploy a
> new standalone shadow RR) and this will increase the BGP sessions to the
> double.
> 
> Add-path will have a network-wide IOS upgrade for this BGP capability to
> be supported which is not viable now.
> 
> So, is there any other recommendation other than the internet VRF with
> different RDs solution?
> 
> Regards,
> 
> Mohamed Kamal
> Core Network Sr. Engineer
> 
> On 8/25/2015 11:37 AM, Jeff Tantsura wrote:
> > Hi,
> >
> > In your case I'd recommend to use diverse path, due to its simplicity and
> > non disruptive deployment characteristics.
> > As you know - diverse path requires additional BGP session per additional
> > (second, next, etc) path, in most cases not a problem, however mileage
> > might vary.
> >
> > To my memory, in Cisco land - it has only been implemented in IOS, not XR,
> > please check.
> >
> > Cheers,
> > Jeff
> >
> >
> >
> >
> > -----Original Message-----
> > From: Diptanshu Singh 
> > Date: Monday, August 24, 2015 at 10:53 PM
> > To: Mohamed Kamal 
> > Cc: "nanog@nanog.org" 
> > Subject: Re: BGP advertise-best-external on RR
> >
> >> Yes . In the case of diverse path , shadow route reflector will be the
> >> one wherever  you enable commands to trigger diverse path computation.
> >>
> >> Good thing with diverse path is that the RR-Clients don't have to have
> >> any support but bad thing is that it can only reflect One additional
> >> best-path( second best path ) .
> >>
> >> Sent from my iPhone
> >>
> >>> On Aug 24, 2015, at 2:31 PM, Mohamed Kamal  wrote:
> >>>
> >>> It's only supported on the 15.2(4)S and later not the SRE train. I
> >>> might consider an upgrade.
> >>>
> >>> One more question regarding this, can you configure the RR to be the
> >>> main and shadow RR?
> >>>
> >>> Mohamed Kamal
> >>> Core Network Sr. Engineer
> >>>
> >>>> On 8/24/2015 9:16 PM, Diptanshu Singh wrote:
> >>>> BGP Add-Path might be your friend . You can look at diverse-path as
> >>>> well .
> >