Re: Networks ignoring prepends?

2024-01-24 Thread Robert Raszuk
Bill,


> https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2
>
> "a) Remove from consideration all routes that are not tied for having
> the smallest number of AS numbers present in their AS_PATH
> attributes."
>
> So literally, the first thing BGP does when picking the best next hop
> is to discard all but the routes with the shortest AS path.


Not really. I have never seen a BGP implementation which would do that.
The section 9 you are referring to is just informational - no specific
order is mandated there.

Shortest AS-PATH is used as step 4 or 5 in best-path selection - not to
mention Cost Communities, which the links below do not even consider:

https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html

https://www.juniper.net/documentation/us/en/software/junos/vpn-l2/bgp/topics/concept/routing-protocols-address-representation.html
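To illustrate where AS_PATH length sits in a vendor-style decision process, here is a simplified Python sketch. This is illustrative only - not any vendor's actual implementation; the step list, attribute names, and defaults are assumptions, and real implementations have more tie-breakers (origin type, eBGP vs iBGP, router ID, ...).

```python
# Simplified sketch of vendor-style BGP best-path selection.
# Illustrative only: the step list and defaults are assumptions.

def best_path(routes):
    """Pick one route dict from `routes` by successive tie-breaking."""
    steps = [
        lambda r: -r.get("weight", 0),        # 1. highest weight (vendor-specific)
        lambda r: -r.get("local_pref", 100),  # 2. highest LOCAL_PREF
        lambda r: len(r["as_path"]),          # 3. shortest AS_PATH -- only here
        lambda r: r.get("origin", 0),         # 4. lowest origin (IGP < EGP < Incomplete)
        lambda r: r.get("med", 0),            # 5. lowest MED
    ]
    candidates = list(routes)
    for key in steps:
        best = min(key(r) for r in candidates)
        candidates = [r for r in candidates if key(r) == best]
        if len(candidates) == 1:
            break
    return candidates[0]

# A longer AS_PATH (prepends) loses only if LOCAL_PREF does not differ:
short = {"as_path": [64500, 64501], "local_pref": 100}
prepended = {"as_path": [64502, 64502, 64502], "local_pref": 200}
print(best_path([short, prepended])["as_path"])  # [64502, 64502, 64502]
```

The point being: a prepended path with a higher LOCAL_PREF still wins, because AS_PATH length is never consulted when an earlier step already breaks the tie.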

Thx,
R.


Re: Networks ignoring prepends?

2024-01-24 Thread Robert Raszuk
All,

> But that ship seems to have sailed.

The problem is well known, and it consists of two orthogonal aspects:

#1 - The ability to signal which return path an arbitrary remote ASN
should choose.

#2 - The remote ASN actually applying this preference.

For #1, I proposed some time back a new set of well-known wide
communities, defined in section 2.2.4 of this draft:
https://datatracker.ietf.org/doc/html/draft-ietf-idr-registered-wide-bgp-communities-02#section-2.2.4

Perhaps one day this will surface such that operators will be able to
signal their preference without extending AS-PATH or trashing the table
with more specifics.

For #2, it is quite likely that the economic aspect plays a role here, so
accepting such a preference may not be free. But before that happens, BGP
should for obvious reasons be secured and updates should be signed. And we
all know how fast that is going to happen.
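The idea behind #1 can be sketched as follows. This is a purely hypothetical encoding - the function names, the (target ASN, action, parameter) tuple, and the action names are made up for illustration; the real proposal is the wide-community registry in the draft above.

```python
# Sketch of a targeted TE signal: a (target_asn, action, parameter) tuple
# that only the named remote AS acts on.  Names and encoding are invented
# for illustration; this is NOT the draft's actual wire format.

MY_ASN = 64500

def apply_signals(route, signals):
    """Apply only the signals addressed to MY_ASN to a route dict."""
    for target_asn, action, param in signals:
        if target_asn != MY_ASN:
            continue  # addressed to some other AS; pass through untouched
        if action == "prepend":
            route["as_path"] = [route["as_path"][0]] * param + route["as_path"]
        elif action == "lower_pref":
            route["local_pref"] = min(route.get("local_pref", 100), param)
    return route

route = {"as_path": [64510], "local_pref": 100}
# Origin asks AS 64500 to prepend twice and AS 64999 to lower preference:
signals = [(64500, "prepend", 2), (64999, "lower_pref", 50)]
apply_signals(route, signals)
print(route["as_path"], route["local_pref"])  # [64510, 64510, 64510] 100
```

The contrast with plain prepending is that the originator names which AS should act and how, instead of lengthening the path for everyone at once.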

Kind regards,
Robert



On Wed, Jan 24, 2024 at 5:38 AM Darrel Lewis  wrote:

>
>
> > On Jan 22, 2024, at 6:53 PM, Jeff Behrns via NANOG 
> wrote:
> >
> >>> William Herrin  wrote:
> > Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.
> >
> > I feel your pain Bill, but from a slightly different angle.  For years
> the large CDNs have been disregarding prepends.  When a source AS
> disregards BGP best path selection rules, it sets off a chain reaction of
> silliness not attributable to the transit AS's.  At the terminus of that
> chain are destination / eyeball AS's now compelled to do undesirable things
> out of necessity such as:
> >  1) Advertise specifics towards select peers - i.e. inconsistent edge
> routing policy & littering global table
> >  2) Continuing to prepend a ridiculous amount anyway
> > Gotta wonder how things would be if everyone just abided by the rules.
> >
>
> One might argue that the global routing system should allow for sites to
> signal their ingress traffic engineering preferences to remote sites in
> ways other than bloating the global routing table.  But that ship seems to
> have sailed.
>
> Regards,
>
> -Darrel
>
>
>


Re: Destination Preference Attribute for BGP

2023-08-31 Thread Robert Raszuk
Hi Michael,

> two datacenters which user traffic can egress, and if one is used we want
that traffic to return to the same
> data center. It is a problem of asymmetry. It appears the only tools we
have are AS_Path and MED, and so
> I have been searching for another solution, that is when I came across
DPA.

If there are really legitimate reasons to force symmetry, I would use
disjoint address pools in each data center - the asymmetry is gone the
moment you hit commit.

And redundancy can still be accomplished at a higher layer - front-end
each DC with a load balancer, or use multiple IP addresses in each DNS
record.
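A sketch of why disjoint per-DC pools pin the return path: the source address a flow egresses with determines where the Internet sends the replies. The prefixes below are illustrative documentation space (RFC 5737), and the pool split is an assumption.

```python
# Why disjoint per-DC address pools remove return-path asymmetry: the
# source address a flow egresses with pins return traffic to the DC that
# originates that prefix.  Prefixes are illustrative (RFC 5737 space).

import ipaddress

DC_POOLS = {
    "dc1": ipaddress.ip_network("198.51.100.0/25"),
    "dc2": ipaddress.ip_network("198.51.100.128/25"),
}

def return_dc(dst_addr):
    """The DC the Internet delivers return traffic to, by prefix match."""
    addr = ipaddress.ip_address(dst_addr)
    for dc, pool in DC_POOLS.items():
        if addr in pool:
            return dc
    raise ValueError("not one of our pools")

# A flow that egressed dc2 used a dc2 source address, so replies land in dc2:
print(return_dc("198.51.100.200"))  # dc2
```

No AS_PATH or MED games are needed for symmetry in this model; each DC simply originates only its own pool.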

Best,
R.


On Wed, Aug 30, 2023 at 6:57 PM michael brooks - ESC <
michael.bro...@adams12.org> wrote:

> >With AS-PATH prepend you have no control on the choice of which ASN
> should do what action on your advertisements.
> Robert- It is somewhat this problem we are trying to resolve.
>
> >I was imagining something sexier, especially given how pretty "useless"
> AS_PATH prepending is nowadays.
> I, too, am looking for something sexy (explained below). But can you
> explain why you think AS_PATH is "useless," Mark?
>
> For background, and the reason I asked about DPA:
> Currently, our routing carries user traffic to a single data center where
> it egresses to the Internet via three ISP circuits, two carriers. We are
> peering on a single switch stack, so we let L2 "load balance" user flows
> for us. We have now brought up another ISP circuit in a second data center,
> and are attempting to influence traffic to return the same path as it
> egressed our network. Simply, we now have two datacenters which user
> traffic can egress, and if one is used we want that traffic to return to
> the same data center. It is a problem of asymmetry. It appears the only
> tools we have are AS_Path and MED, and so I have been searching for another
> solution, that is when I came across DPA. In further looking at the
> problem, BGP Communities also seems to be a possible solution, but as the
> thread has explored, communities may/may not be scrubbed upstream. So,
> presently we are looking for a solution which can be used with our direct
> peers. Obviously, if someone has a better solution, I am all ears.
>
> A bit more info: we are also looking at an internal solution which passes
> IGP metric into MED to influence pathing.
>
> To avoid TL;DR I will stop there in the hopes this is an intriguing enough
> problem to generate discussion.
>
>
>
>
> michael brooks
> Sr. Network Engineer
> Adams 12 Five Star Schools
> michael.bro...@adams12.org
> ::::::::
> "flying is learning how to throw yourself at the ground and miss"
>
>
>
> On Fri, Aug 18, 2023 at 1:39 AM Robert Raszuk  wrote:
>
>> Jakob,
>>
>> With AS-PATH prepend you have no control on the choice of which ASN
>> should do what action on your advertisements.
>>
>> However, the practice of publishing communities by (some) ASNs along with
>> their remote actions could be treated as an alternative to the DPA
>> attribute. It could result in remote PREPEND action too.
>>
>> If only those communities would not be deleted by some transit networks
>> 
>>
>> Thx,
>> R.
>>
>> On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <
>> nanog@nanog.org> wrote:
>>
>>> "prepend as-path" has taken its place.
>>>
>>>
>>>
>>> Kind Regards,
>>>
>>> Jakob
>>>
>>>
>>>
>>>
>>>
>>> Date: Wed, 16 Aug 2023 21:42:22 +0200
>>> From: Mark Tinka 
>>>
>>> On 8/16/23 16:16, michael brooks - ESC wrote:
>>>
>>> > Perhaps (probably) naively, it seems to me that DPA would have been a
>>> > useful BGP attribute. Can anyone shed light on why this RFC never
>>> > moved beyond draft status? I cannot find much information on this
>>> > other than IETF's data tracker
>>> > (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
>>> > (which implies DPA was in use, but then was deprecated).
>>>
>>> I've never heard of this draft until now, but reading it, I can see why
>>> it would likely not be adopted today (not sure what the consensus would
>>> have been back in the '90's).
>>>
>>> DPA looks like MED on drugs.

Re: Destination Preference Attribute for BGP

2023-08-18 Thread Robert Raszuk
> it's really about efficiently *parsing and updating* communities--

Absolutely correct.

Inefficient implementations of how communities are used in inbound or
outbound policies can do a lot of harm - no doubt about that - and as you
say, some surface at the least convenient moments.

But the point I was making is that this is not the fault of the Community
Attribute itself, but rather of a poor implementation choice.
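The failure mode described in the quoted story below can be shown with a toy model: if the hash function collapses every community into one bucket, the per-bucket chain grows to the full input size and lookups degrade from O(1) to O(n). Both hash functions here are invented for illustration; no vendor code is depicted.

```python
# Toy illustration of the hash-bucket pathology: a bad hash function puts
# every community from one transit AS into a single bucket, so the chain
# (worst-case lookup cost) equals the number of communities.

def bad_hash(community, nbuckets):
    # Hashing only the high-order (AS) half: routes all tagged by one
    # transit AS collide into a single bucket.
    return (community >> 16) % nbuckets

def good_hash(community, nbuckets):
    return (community ^ (community >> 16)) % nbuckets  # mixes both halves

def bucket_fill(hash_fn, communities, nbuckets=64):
    buckets = [0] * nbuckets
    for c in communities:
        buckets[hash_fn(c, nbuckets)] += 1
    return max(buckets)  # longest chain ~ worst-case lookup cost

# 10k communities from one transit AS (same high 16 bits, varying value):
comms = [(64500 << 16) | i for i in range(10_000)]
print(bucket_fill(bad_hash, comms))   # 10000 -- one giant linked list
print(bucket_fill(good_hash, comms))  # 157   -- spread across buckets
```

Which is exactly the point: the attribute is cheap to store; it is the parsing/update-time data structures that blow up.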

Kind regards,
RR








On Fri, Aug 18, 2023 at 10:40 PM Matthew Petach 
wrote:

>
>
> On Fri, Aug 18, 2023 at 12:31 PM Robert Raszuk  wrote:
>
>> Hi Jakob,
>>
>> On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <
>> nanog@nanog.org> wrote:
>>
>>> That's true Robert.
>>>
>>> However, communities and med only work with neighbors.
>>>
>>> Communities routinely get scrubbed because they cause increased memory
>>> usage and convergence time in routers.
>>>
>>
>> Considering that we are talking about control plane memory I think
>> the cost/space associated with storing communities is less than
>> negligible these days.
>>
>> And honestly with the number of BGP update generation optimizations I
>> would not say that they contribute to longer protocol convergences in any
>> measurable way.
>>
>> To me this is more of the no trust and policy reasons why communities get
>> dropped on the EBGP peerings.
>>
>> Cheers,
>> R.
>>
>>
> Hi Robert,
>
> Without naming any names, I will note that at some point in the
> not-too-distant past, I was part of a new-years-eve-holiday-escalation to
> $BACKBONE_ROUTER_PROVIDER when the global network I was involved with
> started seeing excessive convergence times (greater than one hour from BGP
> update message received to FIB being updated).
> After tracking down development engineer from $RTR_PROVIDER on the new
> years eve holiday, it was determined that the problem lay in assumptions
> made about how communities were stored in memory.  Think hashed buckets,
> with linked lists within each bucket.  If the communities all happened to
> hash to the same bucket, the linked list in that bucket became extremely
> long; and if every prefix coming in, say from multiple sessions with a
> major transit provider, happened to be adding one more community to the
> very long linked list in that one hash bucket, well, it ended up slowing
> down the processing to the point where updates to the FIB were still
> trickling in an hour after the BGP neighbor had finished sending updates
> across.
>
> A new hash function was developed on New Year's day, and a new version of
> code was built for us to deploy under relatively painful circumstances.
>
> It's easy to say "Considering that we are talking about control
> plane memory I think the cost/space associated with storing communities is
> less than negligible these days."
> The reality is very different, because it's not just about efficiently
> *storing* communities, it's really about efficiently *parsing and updating*
> communities--and the choices made there absolutely *DO* "contribute to
> longer protocol convergences in any measurable way."
>
> Matt
> (the names have been obscured to increase my chances of being hireable in
> the industry again at some future date.  ;)
>
>
>


Re: Destination Preference Attribute for BGP

2023-08-18 Thread Robert Raszuk
Jakob,

Considering how much junk is being added to the BGP protocol these days,
communities are your least worry as far as RAM usage and protocol
convergence time are concerned. Then you have those new concepts of
limited/trusted domains, where a blast radius of much higher caliber than
anything communities could ever reach extends across ASNs.

It is interesting that not many folks from this list participate in the
IETF IDR WG and voice concerns about new BGP extensions, the vast
majority of which have nothing to do with interdomain IPv4 or IPv6
routing.

While it is great that you keep fixing bugs, I would encourage your
platform/RP designers to take a look at Amazon memory and CPU prices and
make RPs a bit more powerful than the average smartphone.

Cheers,
R.

On Fri, Aug 18, 2023 at 8:05 PM Jakob Heitz (jheitz) 
wrote:

> Perhaps to you Robert.
>
> I work on code and with customer issues that escalate to code.
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> *From: *Robert Raszuk 
> *Date: *Friday, August 18, 2023 at 10:59 AM
> *To: *Jakob Heitz (jheitz) 
> *Cc: *nanog@nanog.org 
> *Subject: *Re: Destination Preference Attribute for BGP
>
> Hi Jakob,
>
>
>
> On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <
> nanog@nanog.org> wrote:
>
> That's true Robert.
>
> However, communities and med only work with neighbors.
>
> Communities routinely get scrubbed because they cause increased memory
> usage and convergence time in routers.
>
>
>
> Considering that we are talking about control plane memory I think
> the cost/space associated with storing communities is less than
> negligible these days.
>
>
>
> And honestly with the number of BGP update generation optimizations I
> would not say that they contribute to longer protocol convergences in any
> measurable way.
>
>
>
> To me this is more of the no trust and policy reasons why communities get
> dropped on the EBGP peerings.
>
>
>
> Cheers,
>
> R.
>
>
>
>
>
>
>
>
>
>
>
>
>
> Even new path attributes get scrubbed, because there have been bugs
> related to new ones in the past.
>
> Here is a config snippet in XR
>
>
>
> router bgp 23456
>
> attribute-filter group testAF
>
>   attribute unrecognized discard
>
> !
>
> neighbor-group testNG
>
>   update in filtering
>
>attribute-filter group testAF
>
>
>
> The only thing that has any chance to go multiple ASes is as-path.
>
> Need to be careful with that too because long ones get dropped.
>
>
>
> route-policy testRP
>
>   if as-path length ge 200 then
>
> drop
>
>   endif
>
> end-policy
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> *From: *Robert Raszuk 
> *Date: *Friday, August 18, 2023 at 12:38 AM
> *To: *Jakob Heitz (jheitz) 
> *Cc: *nanog@nanog.org 
> *Subject: *Re: Destination Preference Attribute for BGP
>
> Jakob,
>
>
>
> With AS-PATH prepend you have no control on the choice of which ASN should
> do what action on your advertisements.
>
>
>
> However, the practice of publishing communities by (some) ASNs along with
> their remote actions could be treated as an alternative to the DPA
> attribute. It could result in remote PREPEND action too.
>
>
>
> If only those communities would not be deleted by some transit networks
> 
>
>
>
> Thx,
>
> R.
>
>
>
> On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <
> nanog@nanog.org> wrote:
>
> "prepend as-path" has taken its place.
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> Date: Wed, 16 Aug 2023 21:42:22 +0200
> From: Mark Tinka 
>
> On 8/16/23 16:16, michael brooks - ESC wrote:
>
> > Perhaps (probably) naively, it seems to me that DPA would have been a
> > useful BGP attribute. Can anyone shed light on why this RFC never
> > moved beyond draft status? I cannot find much information on this
> > other than IETF's data tracker
> > (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> > (which implies DPA was in use, but then was deprecated).
>
> I've never heard of this draft until now, but reading it, I can see why
> it would likely not be adopted today (not sure what the consensus would
> have been back in the '90's).
>
> DPA looks like MED on drugs.
>
> Not sure operators want remote downstream ISP's arbitrarily choosing
> which of their peering interconnects (and backbone links) carry traffic
> from source to them. BGP is a poor communicator of bandwidth and
> shilling cost, in general. Those kinds of decisions tend to be locally
> made, and permitting outside influence could be a rather hard sell.
>
> It reminds me of how router vendors implemented GMPLS in the hopes that
> optical operators would allow their customers to build and control
> circuits in the optical domain in some fantastic fashion.
>
> Or how router vendors built Sync-E and PTP into their routers hoping
> that they could sell timing as a service to mobile network operators as
> part of a RAN backhaul service.
>
> Some things just tend to be sacred.
>
> Mark.
>
>


Re: Destination Preference Attribute for BGP

2023-08-18 Thread Robert Raszuk
Hi Jakob,

On Fri, Aug 18, 2023 at 7:41 PM Jakob Heitz (jheitz) via NANOG <
nanog@nanog.org> wrote:

> That's true Robert.
>
> However, communities and med only work with neighbors.
>
> Communities routinely get scrubbed because they cause increased memory
> usage and convergence time in routers.
>

Considering that we are talking about control plane memory, I think the
cost/space associated with storing communities is less than negligible
these days.

And honestly, with the number of BGP update generation optimizations, I
would not say that they contribute to longer protocol convergence in any
measurable way.

To me, this is more about the lack of trust, and the policy reasons why
communities get dropped on EBGP peerings.

Cheers,
R.







> Even new path attributes get scrubbed, because there have been bugs
> related to new ones in the past.
>
> Here is a config snippet in XR
>
>
>
> router bgp 23456
>
> attribute-filter group testAF
>
>   attribute unrecognized discard
>
> !
>
> neighbor-group testNG
>
>   update in filtering
>
>attribute-filter group testAF
>
>
>
> The only thing that has any chance to go multiple ASes is as-path.
>
> Need to be careful with that too because long ones get dropped.
>
>
>
> route-policy testRP
>
>   if as-path length ge 200 then
>
> drop
>
>   endif
>
> end-policy
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> *From: *Robert Raszuk 
> *Date: *Friday, August 18, 2023 at 12:38 AM
> *To: *Jakob Heitz (jheitz) 
> *Cc: *nanog@nanog.org 
> *Subject: *Re: Destination Preference Attribute for BGP
>
> Jakob,
>
>
>
> With AS-PATH prepend you have no control on the choice of which ASN should
> do what action on your advertisements.
>
>
>
> However, the practice of publishing communities by (some) ASNs along with
> their remote actions could be treated as an alternative to the DPA
> attribute. It could result in remote PREPEND action too.
>
>
>
> If only those communities would not be deleted by some transit networks
> 
>
>
>
> Thx,
>
> R.
>
>
>
> On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <
> nanog@nanog.org> wrote:
>
> "prepend as-path" has taken its place.
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> Date: Wed, 16 Aug 2023 21:42:22 +0200
> From: Mark Tinka 
>
> On 8/16/23 16:16, michael brooks - ESC wrote:
>
> > Perhaps (probably) naively, it seems to me that DPA would have been a
> > useful BGP attribute. Can anyone shed light on why this RFC never
> > moved beyond draft status? I cannot find much information on this
> > other than IETF's data tracker
> > (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> > (which implies DPA was in use, but then was deprecated).
>
> I've never heard of this draft until now, but reading it, I can see why
> it would likely not be adopted today (not sure what the consensus would
> have been back in the '90's).
>
> DPA looks like MED on drugs.
>
> Not sure operators want remote downstream ISP's arbitrarily choosing
> which of their peering interconnects (and backbone links) carry traffic
> from source to them. BGP is a poor communicator of bandwidth and
> shilling cost, in general. Those kinds of decisions tend to be locally
> made, and permitting outside influence could be a rather hard sell.
>
> It reminds me of how router vendors implemented GMPLS in the hopes that
> optical operators would allow their customers to build and control
> circuits in the optical domain in some fantastic fashion.
>
> Or how router vendors built Sync-E and PTP into their routers hoping
> that they could sell timing as a service to mobile network operators as
> part of a RAN backhaul service.
>
> Some things just tend to be sacred.
>
> Mark.
>
>


Re: Destination Preference Attribute for BGP

2023-08-18 Thread Robert Raszuk
Jakob,

With AS-PATH prepend you have no control on the choice of which ASN should
do what action on your advertisements.

However, the practice of publishing communities by (some) ASNs along with
their remote actions could be treated as an alternative to the DPA
attribute. It could result in remote PREPEND action too.

If only those communities would not be deleted by some transit networks


Thx,
R.

On Thu, Aug 17, 2023 at 9:46 PM Jakob Heitz (jheitz) via NANOG <
nanog@nanog.org> wrote:

> "prepend as-path" has taken its place.
>
>
>
> Kind Regards,
>
> Jakob
>
>
>
>
>
> Date: Wed, 16 Aug 2023 21:42:22 +0200
> From: Mark Tinka 
>
> On 8/16/23 16:16, michael brooks - ESC wrote:
>
> > Perhaps (probably) naively, it seems to me that DPA would have been a
> > useful BGP attribute. Can anyone shed light on why this RFC never
> > moved beyond draft status? I cannot find much information on this
> > other than IETF's data tracker
> > (https://datatracker.ietf.org/doc/draft-ietf-idr-bgp-dpa/) and RFC6938
> > (which implies DPA was in use, but then was deprecated).
>
> I've never heard of this draft until now, but reading it, I can see why
> it would likely not be adopted today (not sure what the consensus would
> have been back in the '90's).
>
> DPA looks like MED on drugs.
>
> Not sure operators want remote downstream ISP's arbitrarily choosing
> which of their peering interconnects (and backbone links) carry traffic
> from source to them. BGP is a poor communicator of bandwidth and
> shilling cost, in general. Those kinds of decisions tend to be locally
> made, and permitting outside influence could be a rather hard sell.
>
> It reminds me of how router vendors implemented GMPLS in the hopes that
> optical operators would allow their customers to build and control
> circuits in the optical domain in some fantastic fashion.
>
> Or how router vendors built Sync-E and PTP into their routers hoping
> that they could sell timing as a service to mobile network operators as
> part of a RAN backhaul service.
>
> Some things just tend to be sacred.
>
> Mark.
>
>


Re: Outbound Route Filtering (ORF) vendor support

2021-08-20 Thread Robert Raszuk
> This means you'd need to tag EVERYTHING - and that may be operationally
> problematic for Internet routes.

When I wrote my note I envisioned that the RS, on inbound, may tag routes
with RTs (based on the very same communities you would otherwise filter on
without RTs).

Then enabling the RTC SAFI would be pretty easy. I think with an IOS-XE RS
and an IOS-XE client this could even work today without much effort - but
I will say honestly that I have not tried it.

Of course, native filtering based on communities itself may also be cool.

Lastly, dropping updates based on community policy is cheap, so perhaps
the client can just filter the interesting routes locally, between the
adj-RIB-in and the BGP RIB, without signalling. After all, the only bigger
churn is at the original session bring-up; subsequent BGP updates would
usually be pretty painless.
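A minimal sketch of the RT-Constrain (RFC 4684) idea applied at a route server: the RS derives route targets from informative communities, the client advertises the RTs it is interested in, and the RS sends only matching routes. The data structures, prefixes, and RT values below are all illustrative.

```python
# Sketch of the RT-Constrain idea at a route server: send only the routes
# whose route targets intersect the client's advertised interest.
# Values are invented for illustration (RFC 5737 prefixes).

def rs_filter(routes, wanted_rts):
    """Return only the routes whose RTs intersect the client's interest."""
    return [r for r in routes if r["rts"] & wanted_rts]

# RS tags each route with RTs derived from its informative communities:
routes = [
    {"prefix": "203.0.113.0/24", "rts": {"rt:64500:1"}},  # e.g. "peer routes"
    {"prefix": "192.0.2.0/24",   "rts": {"rt:64500:2"}},  # e.g. "customer cone"
]

# Client signals interest in customer-cone routes only (the RTC update):
print(rs_filter(routes, {"rt:64500:2"}))  # only 192.0.2.0/24
```

The filtering happens on the sender side, so the client never has to receive and discard the uninteresting routes.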

Best,
R.



On Fri, Aug 20, 2021 at 9:34 PM Jeffrey Haas  wrote:

> On Fri, Aug 20, 2021 at 04:04:35PM -0300, Douglas Fischer wrote:
> > About the cone definition (by AS-SET) of IXPs... This is an especially
> > important thing.
> > But, unless some external force come to push the IXPs to do it, I don't
> see
> > that we will have that so soon.
>
> The IXP would need to have a mechanism that fits nicely into their route
> server and operational infrastructure.  The mechanism I was referring to
> previously for having it in their IRR was how the RSng infrastructure Merit
> operated years ago worked.  In those days, the route server was the ISI
> software.
>
> (Note that this is historical.)
>
> > About the use of RT-Constrain as a "please give that" tool, Robert
> > mentioned SAFI 1, but...
> > I don't see how to use that on the actual BGP engines on the tradicional
> > BGP sessions. Even considering semantic limitation you mentioned.
>
> Code-wise, it's simple.
>
> Operationally, it's an interesting mess.  Rt-Constrain is a filter that
> says
> "if you have one of these Extended Communities, you can send it".  This
> means you'd need to tag EVERYTHING - and that may be operationally
> problematic for Internet routes.
>
> Some of the related issues are tangentially covered in a proposal to do
> Rt-Constrain on things other than just Extended Communities.
>
>
> https://datatracker.ietf.org/doc/html/draft-zzhang-idr-bgp-rt-constrains-extension-01
>
> > I was reading some drafts and this one caught my attention.
> > https://datatracker.ietf.org/doc/draft-ietf-idr-rpd/
> >
> > That idea of Wide Communities is a one-fit-all tool.
> > Maybe using the feature that will come from this Draft on another way, it
> > could do the "please give that" job.
>
> While I'm clearly a fan of Wide communities, I'd suggest that running an
> entire deployment via the -rpd mechanism still seems operationally
> challenging.  I guess we'll see how that works out.
>
> -- Jeff
>


Re: Outbound Route Filtering (ORF) vendor support

2021-08-19 Thread Robert Raszuk
Hi Doug,

But what you need, you can do today with any decent shipping
implementation of BGP, using RTC:

https://datatracker.ietf.org/doc/html/rfc4684

While originally designed for L3VPNs long ago, the use of RTC has been
extended to other address families, including SAFI 1.

As a matter of fact, because this mechanism was already in place, a few of
the ORF extensions did not move forward.

Many thx,
R.


On Thu, Aug 19, 2021 at 6:19 AM Douglas Fischer 
wrote:

> Thanks Jeffrey!
>
> Well, I invested 15 minutes passing my eyes over the IDR list archive
> Joel mentioned (scary!).
> You were very concise describing all that discussion in such a polite
> way.
>
> I agree that without combining prefix-list and as-path, considering
> ORF's initial purpose, its pros and cons do not pay for themselves.
>
>
> But (there is always a but), I was imagining a different use
> for ext-community-orf !
>
> Considering scenarios like:
>  - Route-Servers of big IXPs, nowadays with almost 200K routes.
>  - Transit providers sending their own point of view of the DFZ with
> almost 900K routes.
> In both cases, informative communities are an excellent way to decide
> "what is good for my ASN, and what is not".
>
> Yes, I know it is possible to filter based on that after receiving those
> routes.
> But it takes computational effort on both sides to do that.
> And I imagine that, compared to AS-Path regex, the computational effort
> and the complexity of the logic needed to do filtering based on a
> community-list are much smaller.
>
>
> So, if I could say:
>  "Hey Mr. Route-Server... how are you?
>  Could you please not send-me the routes that are tagged with the
> community ?
>  And after that, send-me just the routes that are tagged with the
> community ?"
> In a Route-Server context, beyond reduce the number of BGP Messages that
> would be great for the CPU/Memory consumption both, RS and RS-Client.
>
> Or, in a Transit context...
> 1 - Customer opens a ticket with support team to set the export filter to
> send only default-route.
> 2 - Customer, 5 days later, opens a ticket with support team re-adjust the
> export filter, now sending full-routing.
> 3 - Customer, on next month, opens another ticket with support team to
> send only the cone at right of the ASN of ITP.
> With a good and public informative communities policy and
> ext-community-orf, the transit customer could change what his router will
> receive from the BGP transit Peer, depending only on himself.
>
>
> Well... I don't really know how complex it is to deal with that again in a WG.
> But I would like to see that.
>
>
>
> Em qua., 18 de ago. de 2021 às 20:11, Jeffrey Haas 
> escreveu:
>
>> ORFs are a challenging feature and haven't gotten a lot of deployment for
>> a number of reasons.
>>
>> At a high level, they're a very coarse filter.  Since each new ORF type
>> adds to the logical AND condition, you start having to be more and more
>> permissive in what you permit in the policy.  Since a significant amount of
>> common ISP policies require matching things in tuples, this doesn't
>> translate super well into many types of automatically generated ORFs.
>>
>> The ext-community-orf feature was effectively supplanted by Rt-Constrain
>> (RFC 4684).
>>
>> The as-path ORF was challenging because different vendors have different
>> ideas about what "regex" means and what the input tokens are.  Consider for
>> example Juniper vs. Cisco regex matching.  The abstract fix would have been
>> to define a regex that is for the feature.  I half suspect if people pushed
>> on this these days, they'd want PCRE. :-)
>>
>> The RD-ORF work is part of some ongoing discussion about how to deal with
>> VRF overwhelm (prefix-limit exceed).
>>
>> -- Jeff (IDR co-chair)
>>
>> On Aug 18, 2021, at 1:10 PM, Douglas Fischer 
>> wrote:
>>
>> Hello!
>>
>> I also found a recent draft(expires Novembre 2021) about using Route
>> Distinguisher as a Value on ORF.
>> https://datatracker.ietf.org/doc/draft-wang-idr-rd-orf/
>>
>>
>>
>>
>> Em qua., 18 de ago. de 2021 às 11:41, Humberto Galiza <
>> humbertogal...@gmail.com> escreveu:
>>
>>> Hi,
>>>
>>> Is anyone aware of any vendor that supports Outbound Route Filtering
>>> (ORF) based on anything other than prefix-lists?
>>>
>>> I found these two old IETF drafts (both expired :-/) which supported
>>> the idea of filtering based on community and as-path respectively, but
>>> I wasn't able to understand if they were ever discussed at the WG and
>>> if there was any outcome of the discussion (I suspect the authors are
>>> no longer even working with the mentioned companies in the drafts):
>>>
>>> -
>>> https://datatracker.ietf.org/doc/html/draft-chen-bgp-ext-community-orf-02
>>> - https://datatracker.ietf.org/doc/html/draft-ietf-idr-aspath-orf-13
>>>
>>> Any info is very much appreciated.
>>>
>>> Thanks,
>>>
>>
>>
>> --
>> Douglas Fernando Fischer
>> Engº de Controle e Automação
>>
>>
>>
>
> --
> Douglas Fernando Fischer
> Engº de 

Re: SRm6 (was:SRv6)

2020-09-17 Thread Robert Raszuk
Spot on.

And on the point of protection ... in all cases it is orthogonal to the
service itself. If you want to use it, you enable it regardless of whether
your packet's transport is IPv4, IPv6, MPLS, or any SR flavor.

Sure, if you need to traffic-engineer your services, some form of path
control is required. It can be a stack of SIDs, it can be pre-signalled
paths, or it can be pure encap-decap on selected anchor points. Your
network - your choice.

Thx,
R.


On Thu, Sep 17, 2020 at 11:07 AM Saku Ytti  wrote:

> On Thu, 17 Sep 2020 at 11:03, James Bensley 
> wrote:
>
> > MPLSoUDP lacks transport engineering features  like explicit paths, FRR
> LFA and FRR rLFA, assuming only a single IP header is used for the
> transport abstraction [1]. If you want stuff like TI-LFA (I assume this is
> supported in SRm6 and SRv6, but I'm not familiar with these, sorry if that
> is a false assumption) you need additional transport headers or a stack of
> MPLS labels encapped in the UDP header and then you're back to square one.
>
> One of us has confusion about what MPLSoUDP is. I don't run it, so might
> be me.
>
> SPORT == Entropy (so non-cooperating transit can balance)
> DPORT == 6635 (NOT label)
> Payload = MPLS label(s)
>
> Whatever MPLS can do MPLSoUDP can, by definition, do. It is just
> another MPLS point-to-point adjacency after the MPLSoUDP
> abstraction/tunnel.
>
> --
>   ++ytti
>


Re: BFD for routes learned trough Route-servers in IXPs

2020-09-17 Thread Robert Raszuk
>
> If the traffic is that important then the public internet is the wrong
> way to transport it.


Nonsense.

It is usually something said by those who do not know how to use the
Internet as a transport in a reliable way between two endpoints.

In your book, what is the Internet good for? Torrents and porn?

>  The internet has convergence times up to multiple minutes.

It does not matter how long it takes to "converge" any single path.

Hint: Consider using multiple disjoint paths and you will see that for the
vast majority of "Internet failures" the connectivity restoration time
would be very close to the RTT between your endpoints.
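A toy model of that argument, with all numbers invented: with two disjoint paths, loss on one path is detected after roughly one RTT (no ack) and the traffic is resent over the surviving path, independent of how long BGP takes to reconverge. Real deployments would detect failure with BFD or application timeouts.

```python
# Toy model: restoration time with N disjoint paths.  All numbers are
# invented for illustration; detection-after-one-RTT is an assumption.

RTT = 0.050           # 50 ms round trip on either path
BGP_CONVERGENCE = 60  # seconds a single path might take to reconverge

def restoration_time(disjoint_paths):
    if disjoint_paths >= 2:
        # detect loss after ~1 RTT (missing ack), resend over the other path
        return RTT
    return BGP_CONVERGENCE  # stuck waiting for the broken path to heal

print(restoration_time(1))  # 60
print(restoration_time(2))  # 0.05
```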

Rgs,
R.


Re: SRm6 (was:SRv6)

2020-09-16 Thread Robert Raszuk
Hi Ron,

>  If you want an IPv6 underlay for a network offering VPN services

And what's wrong again with MPLS over UDP to accomplish the very same with
simplicity?

MPLS - just a demux label to a VRF/CE
UDP with an IPv6 header - plain and simple

+ minor benefit: you get all of this with zero change to shipping hardware
and software ... Why do we need to go via decks of SRm6 slides and a new wave
of protocol extensions?

Best,
Robert.
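The "demux label to a VRF/CE" step above is just a per-PE lookup table. A minimal sketch follows (illustrative Python; the class and VRF names are made up):

```python
class VpnDemux:
    """Per-PE table mapping a VPN service label to the VRF (or CE) that
    should receive the decapsulated inner packet, as in plain
    MPLS-over-UDP VPN forwarding."""

    def __init__(self):
        self._table = {}

    def bind(self, label: int, vrf: str) -> None:
        """Advertised with the VPN route; installed when the route is learned."""
        self._table[label] = vrf

    def demux(self, label: int) -> str:
        """Look up the service label of a received packet."""
        vrf = self._table.get(label)
        if vrf is None:
            raise KeyError(f"no VRF bound to service label {label}")
        return vrf
```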


On Wed, Sep 16, 2020 at 10:17 PM Ron Bonica via NANOG 
wrote:

> Folks,
>
>
>
> If you want an IPv6 underlay for a network offering VPN services, it makes
> sense to:
>
>
>
>- Retain RFC 4291 IPv6 address semantics
>- Decouple the TE mechanism from the service labeling mechanism
>
>
>
> Please consider the TE mechanism described in
> draft-bonica-6man-comp-rtg-hdr and the service labeling mechanism described
> in draft-bonica-6man-vpn-dest-opt. These can be deployed on a mix and match
> basis. For example, one can deploy:
>
>
>
>- Draft-bonica-6man-vpn-dest-opt only, allowing traffic to follow the
>least-cost path from PE to PE.
>- Deploy draft-bonica-6man-comp-rtg-hdr only, using a legacy method
>(VXLAN, RFC 4797) to label services.
>
>
>
> In all cases, the semantic of the IPv6 address is unchanged. There is no
> need to encode anything new in the IPv6 address.
>
>
>
>
> Ron
>
>
>
> Juniper Business Use Only
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Robert Raszuk via NANOG
It's not about numbers ... it's about the ability to uniformly express policy
with a chain of arguments.

See, even with large communities you can only define a policy with an
unstructured parameter and a single action, and then you need to put it on
all of your boxes to act upon it.

Perhaps it is possible to express there what you need today, or what you
think is possible today.

Imagine if you were sending BGP updates between your internal peers and
telling each peer how to read the encoding ... Doable - sure. A good idea -
not quite.

R.






On Wed, Sep 9, 2020 at 5:19 PM Mark Tinka  wrote:

>
>
> On 9/Sep/20 15:25, Robert Raszuk wrote:
>
> That's not quite true.
>
> See the entire idea behind defining a common mechanism for signalling
> policy in communities in a flexible way for both intra- and inter-domain use
> is to help you use the same encoding across the policy engines of many
> vendors.
>
> I would actually risk saying that it could be even more applicable
> intra-domain than inter-domain.
>
> See, the crux of the thing is that this is not just about putting a bunch of
> type-codes into an IANA registry. It is much more about uniform encoding for
> your actions, with optional parameters, across vendors.
>
> In fact the uphill battle on the implementation side is not because
> signalling a new value in BGP is difficult to encode ... it is much more
> about taking those values and translating them into run-time policies in a
> flexible way.
>
>
> But how does that scale for vendors? Let me speak up for them on this one
> :-).
>
> We are now giving them extra work to write code to standardize communities
> for internal purposes. What extra benefit does that provide in lieu of the
> current method where Juniper sends 1234:9876 to Cisco, and Cisco sees
> 1234:9876?
>
> Should a vendor be concerned about what purpose an internal community
> serves, as long as it does what the Autonomous System wants it to do?
>
> Unless I am totally misunderstanding your goal.
>
> Mark.
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Robert Raszuk via NANOG
>
> Well, the proposed de facto standard is only useful for what we need to
> signal outside of the AS.


That's not quite true.

See, the entire idea behind defining a common mechanism for signalling
policy in communities in a flexible way for both intra- and inter-domain use
is to help you use the same encoding across the policy engines of many
vendors.

I would actually risk saying that it could be even more applicable
intra-domain than inter-domain.

See, the crux of the thing is that this is not just about putting a bunch of
type-codes into an IANA registry. It is much more about uniform encoding for
your actions, with optional parameters, across vendors.

In fact the uphill battle on the implementation side is not because signalling
a new value in BGP is difficult to encode ... it is much more about taking
those values and translating them into run-time policies in a flexible way.

Thx,
R.


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Robert Raszuk via NANOG
And using BGP without an IGP left and right, when even today a bunch of DCs
can do just fine with current IGPs scaling-wise, is IMO not a good thing.

Thx
R.

On Wed, Sep 9, 2020, 10:55 Jeff Tantsura via NANOG  wrote:

> I don’t think, anyone has proposed to use ‘’reserved ASNs” as a BCP,
> example of “ab”use of ASN0 is a de-facto artifact (unfortunate one).
> My goal would be to provide a viable source of information to someone who
> is setting up a new ISP and has very little clue where to start. Do’s
> and don’ts w.r.t. inter-domain community use.
>
> I really enjoyed the difference RFC7938 (Use of BGP for Routing in
> Large-Scale Data Centers) made, literally 100s of companies have used it
> to educate themselves / implement their DC networking.
>
> Cheers,
> Jeff
>
> On Sep 9, 2020, at 10:04, adam via NANOG  wrote:
>
> 
>
> I don’t agree with the use of reserved ASNs, let alone making it BCP,
> cause it defeats the whole purpose of the community structure.
>
> Community is basically sending a message to an AS. If I want your specific
> AS to interpret the message, I set it in the format YOUR_ASN:<value>;
> your AS in the first part of the community means that your rules of how to
> interpret the community value apply.
>
> Turning AS#0 or any other reserved AS# into a “broadcast-AS#” in terms of
> communities (or any other attribute for that matter) just doesn’t sit right
> with me (what’s next? multicast-ASNs that we can subscribe to?).
>
> All the examples in Robert’s draft or wide community RFC, all of them use
> an example AS# the community is addressed to (not some special reserved
> AS#).
>
>
>
> Also, should something like this become a standard, it needs to be properly
> standardized and implemented as a well-known community by most vendors
> (like the RFCs defining wide communities, or additions to the standard
> communities like no_export/no_advertise/…). This would also eliminate the
> adoption friction from operators rightly claiming “my AS my rules”.
>
>
>
> adam
>
>
>
>
>
> *From:* NANOG  *On
> Behalf Of *Douglas Fischer via NANOG
> *Sent:* Tuesday, September 8, 2020 4:56 PM
> *To:* NANOG 
> *Subject:* BGP Community - AS0 is de-facto "no-export-to" marker - Any
> ASN reserved to "export-only-to"?'
>
>
>
> Most of us have already used some BGP community policy to no-export some
> routes to somewhere.
>
> On the majority of IXPs, and with most Transit Providers, the very
> common community telling route-servers and routers "Please do not export
> these routes to that ASN" is:
>
>  -> 0:<target-ASN>
>
> So we could say that this is a de-facto standard.
>
> But the policy equivalent to "Please, export these routes only to that
> ASN" varies widely across IXPs and Transit Providers.
>
> With that said, now come some questions:
>
> 1 - Beyond being a de-facto standard, is there any RFC, public policy, or
> something like that, that would define 0:<target-ASN> as a "no-export-to"
> standard?
>
> 2 - What about reserving some 16-bit ASN to use <reserved-ASN>:<target-ASN>
> as an "export-only-to" standard?
> 2.1 - It is important to be 16 bits because, with (RT) extended communities,
> any ASN on the planet could be the target of that policy.
> 2.2 - Some mnemonic number like 1000 / 1 or so would be interesting.
>
>
>
> --
>
> Douglas Fernando Fischer
> Engº de Controle e Automação
>
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Robert Raszuk via NANOG
Mark,

Nope .. it is the other way around.

It is all easy if you look from your network centric view.

But if I am connected to 10 ISPs in each POP, I have to build 10 different
egress policies, each embedding a custom policy, teach the NOC to understand
them, etc...

I think if there were a defined way to express "prepend N times" to my ISP
peers across all uplinks, or to lower local-pref in my ISPs' networks in the
same way for a whole group of ISPs, I would see the value.

Best Regards,
R.
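A structured instruction of the kind described above - "prepend N times", addressed to a specific ISP, uniform across all uplinks - could be modelled as below. Illustrative Python only; the field and action names are made up, not taken from any RFC or draft.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyInstruction:
    """A vendor-neutral policy instruction of the kind wide communities aim
    to carry: one action plus its parameter, addressed to a target AS."""
    target_asn: int
    action: str        # e.g. "prepend", "lower-local-pref"
    parameter: int

def apply_on_export(as_path, instructions, my_asn, peer_asn):
    """Expand 'prepend' instructions addressed to peer_asn when exporting a
    route: prepend my_asn the requested number of extra times."""
    for ins in instructions:
        if ins.target_asn == peer_asn and ins.action == "prepend":
            as_path = [my_asn] * ins.parameter + as_path
    return as_path
```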


On Wed, Sep 9, 2020, 06:36 Mark Tinka via NANOG  wrote:

>
>
> On 8/Sep/20 23:22, Douglas Fischer via NANOG wrote:
>
> Exactly Mike!
>
> The Idea would be to define some base levels, to make the creations of
> route-filtering simpler to everyone in the world.
> And what comes beyond that, is in charge of each autonomous system.
>
> It would make the scripting and templates easier and would avoid
> fat-fingers.
>
>
> Are we saying that what individual operators design for their own networks
> is "complicated", and that coalescing around a single "de facto" standard
> would simplify that?
>
> Mark.
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Robert Raszuk via NANOG
Mark,

On the last point, yes. The entire idea behind FlowSpec is to work inter-AS
to mitigate DDoS as close to the source as possible.

And if you validate against advertised reachability, what's the problem?

And as far as wide communities go, they just let you structure your community
in a common way. It is available both to customers and to others, as you
choose. Nothing there is about trust. It is all about the mechanics of how
you pass embedded instructions.

Best,
R.






On Wed, Sep 9, 2020, 06:25 Mark Tinka  wrote:

>
>
> On 8/Sep/20 20:15, Robert Raszuk wrote:
>
> > This does not require any more trust for, say, directly connected peers
> > than today, when you publish communities on a web page.
>
> I'd tend to disagree.
>
> Trusting your direct peer to not send you default or to have a 24/7 NOC
> to handle connectivity issues is not the same level of trust I'd afford
> them to send me a community that told my network what to announce to my
> other eBGP neighbors or not.
>
> Of course, I am probably less trusting than most, so I'm not
> recommending anyone follow my advice :-).
>
>
> > It is not about opening up your network. It is about expressing your
> > policy in a common way, opening up exactly as much as you would open up
> > your network today.
>
> I can express my policy, publicly. But I can also indicate who has the
> power to implement that expression on my side.
>
>
> > Notice that in addition to the common types there is an equal amount of
> > space left for operator-defined types. It is just that the structure
> > of the community can take a number of arguments used during execution -
> > that's all.
>
> That is all good and well, and works beautifully within an operator's
> network, which is the point of the capability.
>
> Extending that to non-customer networks is not technically impossible.
> It's just a question of trust.
>
> It's not unlike trusting your customers to send you FlowSpec
> instructions. No issues technically, but do you want to do it?
>
> Mark.
>
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-08 Thread Robert Raszuk via NANOG
Mark,

This does not require any more trust for, say, directly connected peers than
today, when you publish communities on a web page.

It is not about opening up your network. It is about expressing your policy
in a common way, opening up exactly as much as you would open up your network
today.

Notice that in addition to the common types there is an equal amount of space
left for operator-defined types. It is just that the structure of the
community can take a number of arguments used during execution - that's all.

Thx,
R.



On Tue, Sep 8, 2020 at 8:10 PM Mark Tinka  wrote:

>
>
> On 8/Sep/20 18:41, Robert Raszuk wrote:
>
> > I don't think this is the ask here.
> >
> > Today NO_EXPORT takes no parameters. I think it would be of benefit to
> > all to be able to signal NO_EXPORT TO ASN_X in a common (std) way
> > across all of my peers connected to ASN_X. Moreover, policy engines on
> > all vendors could understand it too, without you worrying about matching
> > YOUR_STRING and translating it into some local policy.
> >
> > That is by no means taking away anything you have at your fingertips
> > .. it just adds an option to talk a common policy language.
>
> This already happens today, but mostly in a commercial relationship
> (customer and provider).
>
> While not technically impossible, I struggle to see operators opening up
> their networks to peers they hardly personally (or commercially) know
> with such a feature, custom or standardized.
>
> I suppose the bigger question is - can we trust each other, as peers,
> with such access to each other's networks?
>
> Mark.
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-08 Thread Robert Raszuk via NANOG
Mark,

> The standard already exists... "NO_EXPORT".

I don't think this is the ask here.

Today NO_EXPORT takes no parameters. I think it would be of benefit to all
to be able to signal NO_EXPORT TO ASN_X in a common (std) way across all of
my peers connected to ASN_X. Moreover, policy engines on all vendors could
understand it too, without you worrying about matching YOUR_STRING and
translating it into some local policy.

That is by no means taking away anything you have at your fingertips .. it
just adds an option to talk a common policy language.

Cheers,
R.
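A route-server honoring the conventions discussed in this thread could implement the export decision as below. Illustrative Python only; the 0:<ASN> "no-export-to" high part follows the de-facto usage described here, while the EXPORT_ONLY_TO number is entirely hypothetical and not standardized anywhere.

```python
NO_EXPORT_TO = 0        # de-facto: community 0:<ASN> means "do not export to <ASN>"
EXPORT_ONLY_TO = 64999  # hypothetical reserved 16-bit ASN, made up for illustration

def should_export(communities, peer_asn):
    """Decide whether a route-server exports a route to peer_asn, given the
    route's communities as (high, low) pairs of 16-bit values."""
    blocked = {low for high, low in communities if high == NO_EXPORT_TO}
    allowed = {low for high, low in communities if high == EXPORT_ONLY_TO}
    if peer_asn in blocked:
        return False
    if allowed and peer_asn not in allowed:
        return False
    return True
```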





On Tue, Sep 8, 2020 at 6:23 PM Mark Tinka via NANOG  wrote:

>
>
> On 8/Sep/20 17:55, Douglas Fischer via NANOG wrote:
>
> Most of us have already used some BGP community policy to no-export some
> routes to somewhere.
>
> On the majority of IXPs, and with most Transit Providers, the very
> common community telling route-servers and routers "Please do not export
> these routes to that ASN" is:
>
>  -> 0:<target-ASN>
>
> So we could say that this is a de-facto standard.
>
> But the policy equivalent to "Please, export these routes only to that
> ASN" varies widely across IXPs and Transit Providers.
>
> With that said, now come some questions:
>
> 1 - Beyond being a de-facto standard, is there any RFC, public policy, or
> something like that, that would define 0:<target-ASN> as a "no-export-to"
> standard?
>
> 2 - What about reserving some 16-bit ASN to use <reserved-ASN>:<target-ASN>
> as an "export-only-to" standard?
> 2.1 - It is important to be 16 bits because, with (RT) extended communities,
> any ASN on the planet could be the target of that policy.
> 2.2 - Some mnemonic number like 1000 / 1 or so would be interesting.
>
>
> The standard already exists... "NO_EXPORT". Provided ISPs or exchange
> points can publish their own local values to match that within their
> networks, I believe they can do whatever they want, since it's
> locally significant.
>
> I'm not sure we want to go down the trail of standardizing a "de facto"
> usage. Just like QoS, it may be doomed as different operators define what
> it means for them.
>
> Mark.
>


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-08 Thread Robert Raszuk via NANOG
Hi Douglas,

Just FYI, I have tried to capture the most common use cases of communities
and register them with IANA as part of the wide-community effort.

https://tools.ietf.org/html/draft-ietf-idr-registered-wide-bgp-communities-02

That draft is pending standardization of wide-communities itself.

You are obviously very welcome to either reuse some of this work or support
it :)

Kind regards,
R.

On Tue, Sep 8, 2020 at 5:58 PM Douglas Fischer via NANOG 
wrote:

> Most of us have already used some BGP community policy to no-export some
> routes to somewhere.
>
> On the majority of IXPs, and with most Transit Providers, the very
> common community telling route-servers and routers "Please do not export
> these routes to that ASN" is:
>
>  -> 0:<target-ASN>
>
> So we could say that this is a de-facto standard.
>
> But the policy equivalent to "Please, export these routes only to that
> ASN" varies widely across IXPs and Transit Providers.
>
> With that said, now come some questions:
>
> 1 - Beyond being a de-facto standard, is there any RFC, public policy, or
> something like that, that would define 0:<target-ASN> as a "no-export-to"
> standard?
>
> 2 - What about reserving some 16-bit ASN to use <reserved-ASN>:<target-ASN>
> as an "export-only-to" standard?
> 2.1 - It is important to be 16 bits because, with (RT) extended communities,
> any ASN on the planet could be the target of that policy.
> 2.2 - Some mnemonic number like 1000 / 1 or so would be interesting.
>
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação
>


Re: [outages] Major Level3 (CenturyLink) Issues

2020-09-03 Thread Robert Raszuk
And just to add a little bit of fuel to this fire, let me share that the base
principle of the BGP spec - mandating that routes be withdrawn when the
session goes down - could, in the glory of the IETF, soon be history :(

It started with the proposal to make BGP state "persistent":
https://tools.ietf.org/html/draft-uttaro-idr-bgp-persistence-00

Now it has been smoothed and improved a bit, but the effect is still the
same - keep the routes and do not withdraw them when the session goes down:
https://tools.ietf.org/html/draft-ietf-idr-long-lived-gr-00

Sure, it is up to the operator's discretion to enable it or not. But soon we
will no longer be able to call such behaviour a violation of the BGP RFC if
this proceeds to a formal RFC.

Hint: the max LLST value (24 bits, in seconds) allows over 194 days of prefix
retention - not just the 5 hours or so mentioned here with dislike :)
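The timer arithmetic in the hint can be checked directly:

```python
# Long-Lived Stale Time is a 24-bit field counted in seconds, so the
# maximum period a stale prefix can be retained is:
max_llst_seconds = 2**24 - 1            # 16_777_215 seconds
max_llst_days = max_llst_seconds / 86400
print(f"{max_llst_days:.2f} days")      # roughly 194 days
```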

Best,
R.


On Thu, Sep 3, 2020 at 6:20 PM Mark Tinka  wrote:

>
>
> On 2/Sep/20 15:12, Baldur Norddahl wrote:
>
> > I am not buying it. No normal implementation of BGP stays online,
> > replying to heart beat and accepting updates from ebgp peers, yet
> > after 5 hours failed to process withdrawal from customers.
>
> A BGP RFC spec. is not the same thing as a vendor translating that spec.
> into code. If it were, we'd never need this list.
>
> Triple the effort when deployed and operated at scale.
>
> Mark.
>


Re: RPKI for dummies

2020-08-24 Thread Robert Raszuk
Sure thing :)

Btw, my point was to avoid the potential impression that origin validation
brings any real security to BGP.

Cheers,
R.


On Mon, Aug 24, 2020 at 3:12 PM John Kristoff  wrote:

> On Mon, 24 Aug 2020 13:01:15 +
> Robert Raszuk  wrote:
>
> > I would not say that either S-BGP nor so-BGP were precursors to BGP
> > origin validation ( I am assuming this is what you are referring to
> > as "system we have today").
>
> I would consider origin validation as just one application of the
> system we have today.  Does that sound better?
>
> John
>


Re: RPKI for dummies

2020-08-24 Thread Robert Raszuk
John,

> Two precursors to the system we have today.

I would not say that either S-BGP nor so-BGP were precursors to BGP origin
validation ( I am assuming this is what you are referring to as "system we
have today").

If I recall, securing BGP and validating the source ASN were independent
projects aiming at completely different goals. The former was to assure that
no one could hijack your prefixes along the path, and the latter to detect
someone fat-fingering your prefix or ASN.

Thx,
R.



On Mon, Aug 24, 2020 at 2:43 PM John Kristoff  wrote:

> On Sun, 23 Aug 2020 12:40:19 +
> Dovid Bender  wrote:
>
> > Ok. So here is another n00b question. Why don't we have something
> > where when we advertise IP space we also pass along a cert [...]
>
> Take a look at:
>
>   Stephen Kent, Charles Lynn, and Karen Seo. 2000. Secure border gateway
>   protocol (S-BGP). IEEE Journal on Selected areas in Communications 18, 4
> (2000),
>   582–592.
>
> and
>
>   Russ White. 2003. Securing BGP: soBGP. Internet Protocol Journal 6, 3
>   (Sept. 2003), 15–22.
>
> Two precursors to the system we have today.  Both proposed some form of
> including PKI-related matter in BGP messages.  Neither system gained
> much actual traction outside of the design phase as far as I know.
> Some might suggest that a lot of time was spent debating how to do it
> with little actual progress or experimentation done.  The current
> approach has echoes of those ideas with the obvious difference as you
> imply, it is independent from BGP.  This poses some challenges to
> providing a complete solution, but was probably necessary for deployment
> and might prove useful if something other than BGP wants to use it.
>
> John
>


Re: Has virtualization become obsolete in 5G?

2020-08-04 Thread Robert Raszuk
>   I doubt we want to move away from those concepts.

I think we all do - except the technology is not there yet. Just imagine if
over a single piece of fiber you could get infinite bandwidth delivered over
an unlimited modulation frequency spectrum ...

IMHO, until true optical switching is a commodity, we are stuck with
statistical multiplexing.

But optimistically, I think the time will come when you will be able to
set up end-to-end optical paths in a true any-to-any fashion with real
end-to-end resource guarantees. Then the next generations will look at
current routers like we look today at Strowger telephone switches :)

Cheers,
R.

PS. All of the current attempts to turn IP statistical multiplexing into
network slicing or deterministic networks are far from scaling or practical
deployment (IMO).



On Tue, Aug 4, 2020 at 5:18 PM Mark Tinka  wrote:

>
>
> On 4/Aug/20 16:56, Etienne-Victor Depasquale wrote:
>
> > The survey I pointed to suggests that hard slicing is the least
> > preferred option among survey respondents.
>
> That's because the very nature of DWDM, Ethernet, IP, MPLS and VM's is
> all about re-using the same infrastructure over and over again for it to
> make commercial sense.
>
> I doubt we want to move away from those concepts.
>
> We rely on many services today delivered over the public Internet that
> virtualize and still perform. Even good ol' video streaming, which was
> predicted to break the Internet.
>
> So not sure what applications are driving the demand for "greater QoS"
> on 5G networks, in real terms.
>
> Mark.
>


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-02 Thread Robert Raszuk
Hi Ca,

> Noction is sold to ISPs, aka transit AS, afaik

Interesting.

My impression, from talking to Noction some time back, was always that what
they mainly do is a flavor of performance routing. But this is not about
Noction, IMHO.

If I am a non-transit ASN with N upstream ISPs, I do not want to exit in a
hot-potato style ... if I care about my services, I want to exit via the
best-performing path to reach back to customers. That's btw what Cisco PfR
does, or Google's Espresso, or Facebook's Edge Fabric, etc ...

And you have a few vendors offering this, as well as a bunch of home-grown
tools attempting to do the same. Go and mandate that all of them set
NO-EXPORT if they insert any routes ... and we will still see more and more
of those types of tools coming.

Sure, we have implementations with obligatory policy on eBGP - cool. And yes,
we have match "ANY" too.

So if your feedback is that this is all sufficient to keep iBGP routes from
going out over eBGP, and that we do not need a bit more protection there,
then case solved.

Cheers,
R.
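The exit-selection idea above - probe each upstream and leave via the best-performing one instead of plain hot-potato - reduces to a small loop. Illustrative Python only; probe_rtt stands in for whatever active measurement a real performance-routing tool makes.

```python
def best_exit(upstreams, probe_rtt):
    """Pick the egress upstream with the lowest measured RTT back toward a
    destination, rather than the nearest (hot-potato) exit.

    probe_rtt(upstream) returns a measured RTT in seconds, or raises
    OSError if the path is currently unusable."""
    best, best_rtt = None, float("inf")
    for upstream in upstreams:
        try:
            rtt = probe_rtt(upstream)
        except OSError:
            continue  # skip dead paths
        if rtt < best_rtt:
            best, best_rtt = upstream, rtt
    if best is None:
        raise OSError("no usable upstream")
    return best
```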



On Sun, Aug 2, 2020 at 4:42 PM Ca By  wrote:

>
>
> On Sun, Aug 2, 2020 at 4:34 AM Robert Raszuk  wrote:
>
>> All,
>>
>> Watching this thread with interest got an idea - let me run it by this
>> list before taking it any further (ie. to IETF).
>>
>> How about we learn from this and try to make BGP just a little bit safer
>> ?
>>
>> *Idea: *
>>
>> In all stub (non transit) ASNs we modify BGP spec and disable automatic
>> iBGP to eBGP advertisement ?
>>
>
> Why do you believe a stub AS was involved or that would have changed this
> situation?
>
> The whole point of Noction is for a bad ISP to fake more-specific routes
> to downstream customers.  Noction is sold to ISPs, aka transit AS, afaik
>
>
>
>> *Implementation: *
>>
>> Vendors to allow to define as part of global bgp configuration if
>> given ASN is transit or not. The default is to be discussed - no bias.
>>
>
> Oh. A configuration knob. Noction had knobs; the world runs on 5-year-old
> software with default configs.
>
>
>> *Benefit: *
>>
>> Without any issues anyone playing any tools in his network will be able
>> to just issue one cli
>>
>
> Thanks for not pretending we configure our networks with YANG model APIs
>
> and be protected from accidentally hurting others. Yet naturally he will
>> still be able to advertise his networks just as today except by explicit
>> policy in any shape and form we would find proper (example:
>> "redistribute iBGP to eBGP policy-X").
>>
>
> XR rolls this way today, thanks Cisco. But the “any” keyword exists, so
> yolo.
>
>
>> We could even discuss if this should be perhaps part of BGP OPEN or BGP
>> capabilities too such that two sides of eBGP session must agree with each
>> other before bringing eBGP up.
>>
>> Comments, questions, flames - all welcome :)
>>
>> Cheers,
>> Robert.
>>
>> PS. Such a definition sure can and likely will be misused (especially if
>> we would just settle on only a single side setting it - but that will not
>> cause any more harm as not having it at all.
>>
>> Moreover I can already see few other good options which BGP
>> implementation or BGP spec can be augmented with once we know we are stub
>> or for that matter once it knows it is transit 
>>
>>


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-02 Thread Robert Raszuk
All,

Watching this thread with interest got an idea - let me run it by this list
before taking it any further (ie. to IETF).

How about we learn from this and try to make BGP just a little bit safer?

*Idea: *

In all stub (non-transit) ASNs, we modify the BGP spec and disable automatic
iBGP-to-eBGP advertisement.

*Implementation: *

Vendors would allow operators to define, as part of the global BGP
configuration, whether the given ASN is transit or not. The default is to be
discussed - no bias.

*Benefit: *

Without any issues, anyone playing with any tools in his network will be able
to just issue one CLI command and be protected from accidentally hurting
others. Yet naturally he will still be able to advertise his networks just as
today, except by explicit policy in any shape and form we would find proper
(example: "redistribute iBGP to eBGP policy-X").

We could even discuss whether this should perhaps be part of BGP OPEN or BGP
capabilities too, such that the two sides of an eBGP session must agree with
each other before bringing the session up.

Comments, questions, flames - all welcome :)

Cheers,
Robert.

PS. Such a definition sure can, and likely will, be misused (especially if we
would just settle on only a single side setting it) - but that will not cause
any more harm than not having it at all.

Moreover, I can already see a few other good options with which a BGP
implementation or the BGP spec can be augmented once it knows it is a stub
or, for that matter, once it knows it is transit ...
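The proposed stub-AS default can be sketched as a guard on the export side. Illustrative Python only; the function and parameter names are made up:

```python
def routes_to_advertise(ibgp_routes, local_routes, is_transit, export_policy):
    """Sketch of the proposed default: a stub (non-transit) AS never
    re-advertises iBGP-learned routes over eBGP unless an explicit export
    policy says so; its own locally originated routes are always eligible."""
    advertised = list(local_routes)
    if is_transit:
        advertised += ibgp_routes                       # transit: today's behavior
    else:
        advertised += [r for r in ibgp_routes if export_policy(r)]
    return advertised
```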


Re: Has virtualization become obsolete in 5G?

2020-08-01 Thread Robert Raszuk
>
> I reason that Intel's implication is that virtualization is becoming
> obsolete.
> Would anyone care to let me know his thoughts on this prediction?
>

Virtualization is not becoming obsolete ... quite the reverse, in fact, in
all types of deployments I can see around.

The point is that a VM provides hardware virtualization, while Kubernetes
with containers virtualizes the OS that apps and services run on, in
isolation.

Clearly, virtualizing at the operating-system level - as long as your level
of virtualization, mainly in terms of security and resource-consumption
isolation & reservation, is satisfactory - is a much better and lighter
option.

Thx,
R.


Re: questions asked during network engineer interview

2020-07-21 Thread Robert Raszuk
Bill,

> The Software Defined Network concept started as, "Let's use commodity
> hardware running commodity operating systems to form the control plane
> for our network devices."

That's not exactly the real beginning ... the above is more like "oh, where
do we plug this SDN in and how do we sell it" :)

The last churn of SDN, as I recall and as explained by Nick McKeown, was an
attempt to open up innovation in networking ... allowing one to invent
protocols at will, as well as to set up forwarding tables with arbitrary
switching/routing capabilities, as a student or operator would like to
imagine.

That's when OpenFlow (OF) was born (in its various versions) to allow the
decoupling of hardware and software.

Well I guess that experiment can be considered as completed today :)

Best,
R.


On Tue, Jul 21, 2020 at 9:22 PM William Herrin  wrote:

> On Mon, Jul 20, 2020 at 9:57 PM Mark Tinka  wrote:
> > Suffice it to say, to this day, we still don't know what SDN means to
> > us, hehe.
>
> Hi Mark,
>
> The Software Defined Network concept started as, "Let's use commodity
> hardware running commodity operating systems to form the control plane
> for our network devices." The concept has expanded somewhat to: "Lets
> use commodity hardware running commodity operating systems AS our
> network devices." For example, if you build a high-rate firewall with
> DPDK on Linux, that's now considered SDN since it's commodity hardware,
> commodity OS and custom packet handling (DPDK) that skips the OS.
>
> This is happening a lot in the big shops like Amazon that can afford
> to employ software developers to write purpose-built network code.
>
> Regards,
> Bill Herrin
>
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: BFD for long haul circuit

2020-07-17 Thread Robert Raszuk
>  Unfortunately not.

Fortunately ... very fortunately, Mark.

L2VPNs running on someone's IP backbone, sold by many as "circuits", have
many issues ... stability, MTU blackholes, random drops - and that is pretty
much the same all over the world :(

A very unfortunate technology, just to mux more users and get more $$$ from
a single investment.

Cheers,
R.

On Fri, Jul 17, 2020 at 8:43 AM Mark Tinka  wrote:

>
>
> On 17/Jul/20 02:37, Harivishnu Abhilash wrote:
>
>
>
> Thanks for the update. Do you have any backhauls running over an L2
> xconnect? I’m facing the issue only on a backhaul link over an L2VPN ckt.
>
>
> Unfortunately not. All our backbones are either over dark fibre or EoDWDM.
>
> Mark.
>


BGP flooding

2020-06-23 Thread Robert Raszuk
>
> > So to sum it up, you simply cannot run into any scaling ceiling with
> > MP-BGP architecture.
>
> The flooding nature of BGP requires all the related entities to process
> everything, regardless of whether they need all of it or not.


That is long gone, I am afraid ... Hint: RFC 4684. Now applicable to more
and more AFI/SAFIs.

Also, from day one of L3VPNs, PEs - even if receiving all routes - were
dropping on inbound (a cheap operation) those routes which contained no
locally intersecting RTs.

Thx,
R.
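The inbound RT check mentioned above is a cheap set intersection, as the sketch below shows (illustrative Python; RFC 4684 additionally signals the wanted RTs upstream so that irrelevant routes are never sent at all):

```python
def accept_inbound(route_rts, local_import_rts):
    """A PE drops, on inbound, any VPN route whose route-targets do not
    intersect its locally configured import RTs - a cheap set operation."""
    return bool(set(route_rts) & set(local_import_rts))

def filter_vpn_routes(routes, local_import_rts):
    """routes: iterable of (prefix, rt_list); keep only the locally
    relevant ones, exactly as a PE does on inbound."""
    wanted = set(local_import_rts)
    return [(prefix, rts) for prefix, rts in routes if wanted & set(rts)]
```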


Re: Devil's Advocate - Segment Routing, Why?

2020-06-21 Thread Robert Raszuk
> Wouldn't T-LDP fix this, since LDP LFA is a targeted session?

Nope. You need to get to the PQ node via potentially many hops. So you need
to have either ordered or independent label distribution to its loopback in
place.

Best,
R.

On Sun, Jun 21, 2020 at 10:58 PM Mark Tinka  wrote:

>
>
> On 21/Jun/20 22:21, Robert Raszuk wrote:
>
>
> Well this is true for one company :) Name starts with j 
>
> The other company - name starting with c - at least some time back by
> default allocated labels for all routes in the RIB, whether connected,
> static, or sourced from the IGP. Sure, you could always limit that with a
> knob if desired.
>
>
>
> Juniper allocates labels to the Loopback only.
>
> Cisco allocates labels to all IGP and interface routes.
>
> Neither allocate labels to BGP routes for the global table.
>
>
>
> The issue with allocating labels only for BGP next hops is that your
> IP/MPLS LFA breaks (or, more directly, is not possible), as you do not have
> a label to the PQ node upon failure. Hint: the PQ node is not even running
> BGP :).
>
>
> Wouldn't T-LDP fix this, since LDP LFA is a targeted session?
>
> Need to test.
>
>
>
> Sure, selective folks still count on "IGP Convergence" to restore
> connectivity. But I hope those will move to much faster connectivity
> restoration techniques soon.
>
>
> We are happy :-).
>
> Mark.
>


Re: Devil's Advocate - Segment Routing, Why?

2020-06-21 Thread Robert Raszuk
>
> I should point out that all of my input here is based on simple MPLS
> forwarding of IP traffic in the global table. In this scenario, labels
> are only assigned to BGP next-hops, which is typically an IGP Loopback
> address.
>

Well this is true for one company :) Name starts with j 

Other company name starting with c - at least some time back by default
allocated labels for all routes in the RIB either connected or static or
sourced from IGP. Sure you could always limit that with a knob if desired.

The issue with allocating labels only for BGP next hops is that your
IP/MPLS LFA breaks (or more directly is not possible) as you do not have a
label to PQ node upon failure.  Hint: PQ node is not even running BGP :).

Sure, selective folks still count on "IGP Convergence" to restore
connectivity. But I hope those will move to much faster connectivity
restoration techniques soon.


> Labels don't get assigned to BGP routes in a global table. There is no
> use for that.
>

Sure - True.

Cheers,
R,


Re: Devil's Advocate - Segment Routing, Why?

2020-06-21 Thread Robert Raszuk
> The LFIB in each node need only be as large as the number of LDP-enabled
> routers in the network.

That is true for P routers ... not so much for PEs.

Please observe that the label space in each PE router is divided among IGP and
BGP as well as other label-hungry services ... there are many consumers of
the local label block.

So it is always the case that the LFIB table (max 2^20 entries, ~1M) on PEs is
much larger than the LFIB on P nodes.
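Purely illustrative arithmetic on the label-space point: the consumer names and allocation sizes below are invented; only the 2^20 ceiling is real.

```python
# Sketch: carving a PE's 20-bit label space among local label consumers.
# Sizes are made up for illustration; the 2^20 ceiling is the real limit.
LABEL_SPACE = 2 ** 20  # 1,048,576 label values; 0-15 are reserved

consumers = {
    "reserved": 16,
    "igp/ldp transport": 50_000,   # roughly P-router scale
    "l3vpn service": 600_000,      # per-prefix/per-CE/per-VRF labels
    "l2vpn/pseudowire": 100_000,
    "te/sr": 50_000,
}
used = sum(consumers.values())
print(f"used {used:,} of {LABEL_SPACE:,} ({used / LABEL_SPACE:.0%})")
# used 800,016 of 1,048,576 (76%)
```

The point being that service labels, not transport labels, dominate a PE's LFIB.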

Thx,
R.




On Sun, Jun 21, 2020 at 6:01 PM Mark Tinka  wrote:

>
>
> On 21/Jun/20 15:48, Robert Raszuk wrote:
>
>
>
> Actually when IGP changes LSPs are not recomputed with LDP or SR-MPLS
> (when used without TE :).
>
> "LSP" term is perhaps what drives your confusion --- in LDP MPLS there is
> no "Path" - in spite of the acronym (Labeled Switch *Path*). Labels are
> locally significant and swapped at each LSR - resulting essentially with a
> bunch of one hop crossconnects.
>
> In other words MPLS LDP strictly follows IGP SPT at each LSR hop.
>
>
> Yep, which is what I tried to explain as well. With LDP, MPLS-enabled
> hosts simply push, swap and pop. There is not concept of an "end-to-end
> LSP" as such. We just use the term "LSP" to define an FEC. But really, each
> node in the FEC's path is making its own push, swap and pop decisions.
>
> The LFIB in each node need only be as large as the number of LDP-enabled
> routers in the network. You can get scenarios where FECs are also created
> for infrastructure links, but if you employ filtering to save on FIB slots,
> you really just need to allocate labels to Loopback addresses only.
>
> Mark.
>


Re: Devil's Advocate - Segment Routing, Why?

2020-06-21 Thread Robert Raszuk
> I'm saying that, if some failure occurs and IGP changes, a
> lot of LSPs must be recomputed, which does not scale
> if # of LSPs is large, especially in a large network
> where IGP needs hierarchy (such as OSPF area).
>
> Masataka Ohta
>


Actually when IGP changes LSPs are not recomputed with LDP or SR-MPLS (when
used without TE :).

"LSP" term is perhaps what drives your confusion --- in LDP MPLS there is
no "Path" - in spite of the acronym (Labeled Switch *Path*). Labels are
locally significant and swapped at each LSR - resulting essentially with a
bunch of one hop crossconnects.

In other words MPLS LDP strictly follows IGP SPT at each LSR hop.
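That per-hop behavior can be sketched in a few lines. The topology, label values, and LFIB contents below are invented for illustration; label 3 stands in for the implicit-null / penultimate-hop-popping convention.

```python
# Sketch: LDP labels are locally significant. Each LSR swaps the
# incoming label for whatever label its IGP next hop advertised for
# the same FEC - a chain of one-hop cross-connects along the IGP SPT.

# per-router LFIB: incoming label -> (next hop, outgoing label)
lfib = {
    "P1": {100: ("P2", 200)},
    "P2": {200: ("P3", 300)},
    "P3": {300: ("PE2", 3)},  # 3 = implicit null -> pop at penultimate hop
}

def forward(router, label, path=None):
    """Follow the per-hop swaps and return the routers traversed."""
    path = path or [router]
    next_hop, out_label = lfib[router][label]
    path.append(next_hop)
    if out_label == 3:  # PHP: pop and deliver a plain IP packet to egress
        return path
    return forward(next_hop, out_label, path)

print(forward("P1", 100))  # ['P1', 'P2', 'P3', 'PE2']
```

Note that no router holds any end-to-end path state; each hop only knows its own swap.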

Many thx,
R.


Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)

2020-06-21 Thread Robert Raszuk
Let's clarify a few things ...

On Sun, Jun 21, 2020 at 2:39 PM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

If all the link-wise (or, worse, host-wise) information of possible
> destinations is distributed in advance to all the possible sources,
> it is not hierarchical but flat (host) routing, which scales poorly.
>
> Right?
>

Neither link wise nor host wise information is required to accomplish say
L3VPN services. Imagine you have three sites which would like to
interconnect each with 1000s of users.

So all you are exchanging as part of VPN overlay is three subnets.

Moreover, if you have 1000 PEs and those three sites are attached to only 6 of
them, only those 6 PEs will need to learn those routes (hint: RTC -
RFC 4684).

It is because detailed information to reach destinations
> below certain level is advertised not globally but only for
> small part of the network around the destinations.
>

Same thing here.


> That is, with hierarchical routing, detailed information
> around destinations is actively hidden from sources.
>

Same thing here.

That is why, as described, we use a label stack. The top label is responsible
for getting you to the egress PE. The service label sitting behind the top
label is responsible for getting you through to the customer site (with or
without an IP lookup at the egress PE).
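A minimal sketch of that two-level stack at the egress side. All label values and VRF names are invented for illustration.

```python
# Sketch: the top (transport) label carries the packet to the egress
# PE; the service (VPN) label behind it selects the customer VRF there.

packet = {
    "stack": [24001, 16],  # [transport label, VPN service label]
    "dst": "10.1.1.1",
}

# At the egress PE the transport label is already gone (popped by the
# penultimate hop); the remaining service label picks the VRF without
# any lookup in the provider's global table.
vpn_label_table = {16: "vrf-CustomerA"}
service_label = packet["stack"][-1]
print(vpn_label_table[service_label])  # vrf-CustomerA
```

This is exactly why the core never needs to carry the customer prefixes: they are hidden behind the service label.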


> So, with hierarchical routing, routing protocols can
> carry only rough information around destinations, from
> which, source side can not construct detailed (often
> purposelessly nested) labels required for MPLS.
>

Usually sources have no idea of MPLS. MPLS to the host never took off.


> According to your theory to ignore routing traffic, we can be happy
> with global *host* routing table with 4G entries for IPv4 and a lot
> lot lot more than that for IPv6. CIDR should be unnecessary
> complication to the Internet
>

I do not think anyone is saying that here.


> With nested labels, you don't need so much labels at certain nesting
> level, which was the point of Yakov, which does not mean you don't
> need so much information to create entire nested labels at or near
> the sources.
>

The label stack has been here from day one. Each layer of the stack has a
completely different role. That is your hierarchy.

Kind regards,
R.


Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)

2020-06-21 Thread Robert Raszuk
It is destination-based flat routing, distributed 100% before any data
packet, within each layer - yes. But the layers are decoupled, so in a sense
this is what defines a hierarchy overall.

So transport is using MPLS LSPs; most often host IGP routes are matched
with LDP FECs and flooded everywhere, in spite of RFC 5283 at least allowing
IGP aggregation.

Then say L2VPNs or L3VPNs with their own choice of routing protocols are in
turn distributing reachability for the customer sites. Those are service
routes linked to transport by BGP next hop(s).

Many thx,
R.


On Sun, Jun 21, 2020 at 1:11 PM Masataka Ohta <
mo...@necom830.hpcl.titech.ac.jp> wrote:

> Robert Raszuk wrote:
>
> > MPLS LDP or L3VPNs was NEVER flow driven.
> >
> > Since day one till today it was and still is purely destination based.
>
> If information to create labels at or near sources to all the
> possible destinations is distributed in advance, may be. But
> it is effectively flat routing, or, in extreme cases, flat host
> routing.
>
> Or, if information to create labels to all the active destinations
> is supplied on demand, it is flow driven.
>
> On day one, Yakov said MPLS had scaled because of nested labels
> corresponding to routing hierarchy.
>
> Masataka Ohta
>


Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)

2020-06-20 Thread Robert Raszuk
> The problem of MPLS, however, is that, it must also be flow driven,
> because detailed route information at the destination is necessary
> to prepare nested labels at the source, which costs a lot and should
> be attempted only for detected flows.
>

MPLS is not flow driven. I sent some mail about it but perhaps it bounced.

MPLS LDP or L3VPNs was NEVER flow driven.

Since day one till today it was and still is purely destination based.

Transport is using LSP to egress PE (dst IP).

L3VPNs are using either per-dst-prefix, per-CE or per-VRF labels. No
implementation does anything upon "flow detection" to prepare any nested
labels. Even in FIBs, all information is preprogrammed in a hierarchical
fashion well before any flow's packet arrives.
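A sketch of that preprogrammed hierarchy: the VPN prefix resolves to a BGP next hop, which in turn resolves to a transport label, all installed before any traffic flows. Names, labels, and interfaces are invented for illustration.

```python
# Sketch: hierarchical FIB resolution for an L3VPN prefix. Everything
# below is programmed at route-install time - no per-flow state exists.

vrf_fib = {"10.1.0.0/24": {"vpn_label": 16, "bgp_nh": "192.0.2.1"}}
transport_fib = {"192.0.2.1": {"out_label": 24001, "iface": "ge-0/0/0"}}

def rewrite(prefix):
    """Build the label-stack rewrite for a VPN prefix (ingress PE view)."""
    svc = vrf_fib[prefix]                     # service level
    tr = transport_fib[svc["bgp_nh"]]         # transport level (recursion)
    # bottom of stack: service label; top of stack: transport label
    return {"stack": [tr["out_label"], svc["vpn_label"]], "iface": tr["iface"]}

print(rewrite("10.1.0.0/24"))
# {'stack': [24001, 16], 'iface': 'ge-0/0/0'}
```

A nice side effect of the indirection: when the transport path changes, only the next-hop entry is updated, not every VPN prefix behind it.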

Thx,
R.




>
>  > there is the argument that switching MPLS is faster than IP; when the
>  > pressure points i see are more at routing (BGP/LDP/RSVP/whatever),
>  > recovery, and convergence.
>
> Routing table at IPv4 backbone today needs at most 16M entries to be
> looked up by simple SRAM, which is as fast as MPLS lookup, which is
> one of the reasons why we should obsolete IPv6.
>
> Though resource reserved flows need their own routing table entries,
> they should be charged proportional to duration of the reservation,
> which can scale to afford the cost to have the entries.
>
> Masataka Ohta
>
>


Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)

2020-06-20 Thread Robert Raszuk
> there is saku's point of distributing labels in IGP TLVs/LSAs.  i
> suspect he is correct, but good luck getting that anywhere in the
> internet vendor task force.

Perhaps I will surprise a few, but this is not only already in RFC form - it
has also been shipping across vendors for some time now.

SR-MPLS (as part of its spec) does exactly that. You do not need to use any
SR if you do not want to; you can still encapsulate your packets with a
transport label corresponding to your exit at any ingress and forget about
LDP for good.

But with that, let's not forget that aggregation here is still not well
spec'ed out, and to the best of my knowledge it is also not shipping yet. I
recently proposed an idea of how to aggregate SRGBs ... one vendor is
analyzing it.

Best,
R.



On Sat, Jun 20, 2020 at 1:33 AM Randy Bush  wrote:

> < ranting of a curmudgeonly old privileged white male >
>
> >>> MPLS was since day one proposed as enabler for services originally
> >>> L3VPNs and RSVP-TE.
> >> MPLS day one was mike o'dell wanting to move his city/city traffic
> >> matrix from ATM to tag switching and open cascade's hold on tags.
> > And IIRC, Tag switching day one was Cisco overreacting to Ipsilon.
>
> i had not thought of it as overreacting; more embrace and devour.  mo
> and yakov, aided and abetted by sob and other ietf illuminati, helped
> cisco take the ball away from Ipsilon, Force10, ...
>
> but that is water over the damn, and my head is hurting a bit from
> thinking on too many levels at once.
>
> there is saku's point of distributing labels in IGP TLVs/LSAs.  i
> suspect he is correct, but good luck getting that anywhere in the
> internet vendor task force.  and that tells us a lot about whether we
> can actually effect useful simplification and change.
>
> is a significant part of the perception that there is a forwarding
> problem the result of the vendors, 25 years later, still not
> designing for v4/v6 parity?
>
> there is the argument that switching MPLS is faster than IP; when the
> pressure points i see are more at routing (BGP/LDP/RSVP/whatever),
> recovery, and convergence.
>
> did we really learn so little from IP routing that we need to
> recreate analogous complexity and fragility in the MPLS control
> plane?  ( sound of steam eminating from saku's ears :)
>
> and then there is buffering; which seems more serious than simple
> forwarding rate.  get it there faster so it can wait in a queue?  my
> principal impression of the Stanford/Google workshops was the parable
> of the blind men and the elephant.  though maybe Matt had the main
> point: given scaling 4x, Moore's law can not save us and it will all
> become paced protocols.  will we now have a decade+ of BBR evolution
> and tuning?  if so, how do we engineer our networks for that?
>
> and up 10,000m, we watch vendor software engineers hand crafting in
> an assembler language with if/then/case/for, and running a chain of
> checking software to look for horrors in their assembler programs.
> it's the bleeping 21st century.  why are the protocol specs not
> formal and verified, and the code formally generated and verified?
> and don't give me too slow given that the hardware folk seem to be
> able to do 10x in the time it takes to run valgrind a few dozen
> times.
>
> we're extracting ore with hammers and chisels, and then hammering it
> into shiny objects rather than safe and securable network design and
> construction tools.
>
> apologies.  i hope you did not read this far.
>
> randy
>


Re: Devil's Advocate - Segment Routing, Why?

2020-06-19 Thread Robert Raszuk
>
> One of the advantages cited for SRv6 over MPLS is that the packet contains
> a record of where it has been.
>
>
Not really ... packets are not tourists in a bus.

First, there are real studies showing that, for the goal of good TE, most
large production networks only need to specify 1, 2 or 3 hops to traverse
through. The rest is the shortest path between those hops.

Then, even if you place those node SIDs, you have no control over which
interfaces are chosen as outbound. There is often more than one IGP ECMP path
in between. You would need to insert adjacency SIDs, which does require a
pretty fine level of controller capability to start with.
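The first point can be sketched in a few lines: an SR-TE policy is typically just a short list of waypoint node SIDs, with plain IGP shortest-path (possibly ECMP) forwarding between them. Router names and SID values below are invented; a real deployment would derive them from a controller's view of the IGP.

```python
# Sketch: translating a couple of TE waypoints into an SR label stack.
# Between consecutive SIDs, traffic simply follows the IGP shortest path.

def build_segment_list(waypoints, node_sid):
    """Map waypoint router names to their (invented) node SIDs."""
    return [node_sid[w] for w in waypoints]

node_sid = {"P5": 16005, "P9": 16009, "PE2": 16002}

# steer via P5, then P9, then shortest path to the egress PE2
print(build_segment_list(["P5", "P9", "PE2"], node_sid))
# [16005, 16009, 16002]
```

Pinning the exact outbound interface at every hop would instead require adjacency SIDs for each link, which is where the stack depth and controller complexity grow.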

I just hope that no one sane proposes that all packets should now get
encapsulated in a new IPv6 header when entering a transit ISP network and
carry a long list of hop-by-hop adjacencies to travel by. Besides, even if
they did, the list would be valid only within a given ASN and have no
visibility outside.

Thx,
R.


Re: Devil's Advocate - Segment Routing, Why?

2020-06-19 Thread Robert Raszuk
>
> But, today, people seem to be using, so called, MPLS, with
>
explicitly configured flows, administration of which does not
> scale and is annoying.
>

I am actually not sure what you are talking about here.

The only per-flow action in any MPLS deployment I have seen was mapping flow
groups to specific TE-LSPs. In all other TDP or LDP cases flow == IP
destination, so it is exactly based on destination reachability. And such
mapping is based on the LDP FEC to IGP (or BGP) match.

Even worse, if the router near the destination expected to pop the label
> chain goes down, how can the source know that the router has gone down
> and choose an alternative router near the destination?
>

In normal MPLS the src does not pick the transit paths. Transit is 100%
driven by the IGP, and if you lose a node, local connectivity restoration
techniques apply (FRR or IGP convergence). If the egress signalled implicit
NULL, it would signal it to any IGP peer.

That is also possible with SR-MPLS. No change ... no per-flow state at all
beyond per-IP-destination routing. If you want to control your transit hops
you can - but this is an option, not a requirement.

MPLS with hierarchical routing just does not scale.


While I am not defending MPLS here, and 100% agree that IP as transit is a
much better option today and tomorrow, I would also like to make sure we
communicate true points. So when you say it does not scale, it would be
good to list what exactly does not scale by providing a real network
operational example.

Many thx,
R.


Re: Devil's Advocate - Segment Routing, Why?

2020-06-19 Thread Robert Raszuk
Hi Mark,

As someone who was actually at that table you are referring to, I must say
that MPLS was never proposed as a replacement for IP.

MPLS was, since day one, proposed as an enabler for services, originally
L3VPNs and RSVP-TE. Then a bunch of others jumped on the same encapsulation
train. If, at that very time, the GSR had been able to do proper GRE
encapsulation at line rate in all of its engines, MPLS for transport would
never have taken off. As a service demux - sure, but this is completely
separate.

But since at that time shipping hardware could not do the right
encapsulation, and since SPs were looking for more revenue and a new way to
move ATM and FR customers to IP backbones, L3VPN was proposed, which really
required hiding the service addresses from everyone's core. So some form of
encapsulation was a MUST. Hence tag switching, then MPLS switching, was
rolled out.

So I think Ohta-san's point is about the scalability of services, not flat
underlay RIB and FIB sizes. Many years ago we had requests to support 5M
L3VPN routes while the underlay was just 500K IPv4.

Last, when I originally discussed just plain MPLS with customers, with the
single application of hierarchical routing (no BGP in the core), frankly no
one was interested. Till L3VPN arrived, which was a game changer and a run
for new revenue streams ...

Best,
R.


On Fri, Jun 19, 2020 at 5:00 PM Mark Tinka  wrote:

>
>
> On 19/Jun/20 16:45, Masataka Ohta wrote:
>
> > The problem of MPLS, or label switching in general, is that, though
> > it was advertised to be topology driven to scale better than flow
> > driven, it is actually flow driven with poor scalability.
> >
> > Thus, it is impossible to deploy any technology scalably over MPLS.
> >
> > MPLS was considered to scale, because it supports nested labels
> > corresponding to hierarchical, thus, scalable, routing table.
> >
> > However, to assign nested labels at the source, the source
> > must know hierarchical routing table at the destination, even
> > though the source only knows hierarchical routing table at
> > the source itself.
> >
> > So, the routing table must be flat, which does not scale, or
> > the source must detect flows to somehow request hierarchical
> > destination routing table on demand, which means MPLS is flow
> > driven.
> >
> > People, including some data center people, avoiding MPLS, know
> > network scalability better than those deploying MPLS.
> >
> > It is true that some performance improvement is possible with
> > label switching by flow driven ways, if flows are manually
> > detected. But, it means extra label-switching-capable equipment
> > and administrative effort to detect flows, neither of which
> > scales, and both cost a lot.
> >
> > It cost a lot less to have more plain IP routers than insisting
> > on having a little fewer MPLS routers.
>
> I wouldn't agree.
>
> MPLS is a purely forwarding paradigm, as is hop-by-hop IP. Even with
> hop-by-hop IP, you need the edge to be routing-aware.
>
> I wasn't at the table when the MPLS spec. was being dreamed up, but I'd
> find it very hard to accept that someone drafting the idea advertised it
> as being a replacement or alternative for end-to-end IP routing and
> forwarding.
>
> Whether you run MPLS or not, you will always have routing table scaling
> concerns. So I'm not quite sure how that is MPLS's problem. If you can
> tell me how NOT running MPLS affords you a "hierarchical, scalable"
> routing table, I'm all ears.
>
> Whether you forward in IP or in MPLS, scaling routing is an ever clear &
> present concern. Where MPLS can directly mitigate that particular
> concern is in the core, where you can remove BGP. But you still need
> routing in the edge, whether you forward in IP or MPLS.
>
> Mark.
>
>


Re: Devil's Advocate - Segment Routing, Why?

2020-06-18 Thread Robert Raszuk
Hi Saku,

To your IGP point, let me observe that OSPF runs over IP and ISIS does not.
That is the first fundamental difference. There are customers using both all
over the world, and therefore any suggestion to just use OSPFv3 is IMHO
quite unrealistic. Keep in mind that the OSPF hierarchy is 2 levels (or 3
with a super area), while in the IETF there is ongoing work to extend ISIS
to 8 levels. There are a lot of fundamental differences between those two
(or three) IGPs, and I am sure many folks on the lists know them. Last,
there are a lot of enterprise networks happily using IPv4 RFC 1918 all over
their global WAN and DC infrastructure, and they have no reason to deploy
IPv6 there any time soon.

If you are serious about converging to a single IGP, I would rather look
towards an OpenR type of IGP architecture with a message bus underneath.

Thx,
R.

On Thu, Jun 18, 2020 at 7:26 AM Saku Ytti  wrote:

> On Thu, 18 Jun 2020 at 01:17, Mark Tinka  wrote:
>
> > IOS XR does not appear to support SR-OSPFv3.
> > IOS XE does not appear to support SR-ISISv6.
> > IOS XE does not appear to support SR-OSPFv3.
> > Junos does not appear to support SR-OSPFv3.
>
> The IGP mess we are in is horrible, but I can't blame SR for it. It's
> really unacceptable we spend NRE hours developing 3 identical IGP
> (OSPFv2, OSPFv3, ISIS). We all pay a 300-400% premium for a single
> IGP.
>
> In a sane world, we'd retire all of them except OSPFv3 and put all NRE
> focus on there or move some of the NRE dollars to some other problems
> we have, perhaps we would have room to support some different
> non-djikstra IGP.
>
> In a half sane world, IGP code, 90% of your code would be identical,
> then you'd have adapter/ospfv2 adapter/ospfv3 adapter/isis which
> translates internal struct to wire and wire to internal struct. So any
> features you code, come for free to all of them. But no one is doing
> this, it's 300% effort, and we all pay a premium for that.
>
> In a quarter sane world we'd have some CIC, common-igp-container RFC
> and then new features like SR would be specified as CIC-format,
> instead of OSPFv2, OSPFv3, ISIS and BGP. Then each OSPFv2, OSPFv3,
> ISIS and BGP would have CIC-to-x RFC. So people introducing new IGP
> features do not need to write 4 drafts, one is enough.
>
> I would include IPv4+IPv6 my-igp-of-choice SR in my RFP. Luckily ISIS
> is supported on platforms I care about for IPV4+IPV6, so I'm already
> there.
>
> > MPLS/VPN service signaling in IPv6-only networks also has gaps in SR.
>
> I don't understand this.
>
>
> > So for networks that run OSPF and don't run Juniper, they'd need to move
> to IS-IS in order to have SR forward IPv6 traffic in an MPLS encapsulation.
> Seems like a bit of an ask. Yes, code needs to be written, which is fine by
> me, as it also does for LDPv6.
>
> And it's really just adding TLV, if it already does IPv4 all the infra
> should be in place, only  thing missing is transporting the
> information. Adding TLV to IGP is a lot less work than LDPv6.
>
> > I'd be curious to understand what bugs you've suffered with LDP in the
> last 10 or so years, that likely still have open tickets.
>
> 3 within a year.
> - PR1436119
> - PR1428081
> - PR1416032
>
> I don't have IOS-XR LDP bugs within a year, but we had a bunch back
> when going from 4 to 5. And none of these are cosmetic, these are
> blackholing.
>
> I'm not saying LDP is bad, it's just, of course more code lines you
> exercise more bugs you see.
>
> But yes, LDP has a lot of bug surface compared to SR, but in _your
> network_ lot of that bug surface and complexity is amortised
> complexity. So status quo bias is strong to keep running LDP, it is
> simpler _NOW_ as a lot of the tax has been paid and moving to an
> objectively simpler solution carries risk, as its complexity is not
> amortised yet.
>
>
> > Yes, we all love less state, I won't argue that. But it's the same
> question that is being asked less and less with each passing year - what
> scales better in 2020, OSPF or IS-IS. That is becoming less relevant as
> control planes keep getting faster and cheaper.
>
> I don't think it ever was relevant.
>
> > I'm not saying that if you are dealing with 100,000 T-LDP sessions you
> should not consider SR, but if you're not, and SR still requires a bit more
> development (never mind deployment experience), what's wrong with having
> LDPv6? If it makes near-as-no-difference to your control plane in 2020 or
> 2030 as to whether your 10,000-node network is running LDP or SR, why not
> have the choice?
>
> I can't add anything to the upside of going from LDP to SR that I've
> not already said. You get more by spending less, it's win:win. Only
> reason to stay in LDP is status quo bias which makes short term sense.
>
> > Routers, in 2020, still ship with RIPv2. If anyone wants to use it (as I
> am sure there are some that do), who are we to stand in their way, if it
> makes sense for them?
>
> RIP might make sense in some deployments, because it's essentially
> stateless 

Re: Devil's Advocate - Segment Routing, Why?

2020-06-17 Thread Robert Raszuk
>
> Anything that can support LDPv4 today can support LDPv6, in hardware.
>

While I am trying to stay out of this interesting discussion, the above
statement is not fully correct.

Yes, in the MPLS2MPLS path you are correct.

But the ingress and egress switching vectors are very different for LDPv6:
at ingress you need to match on IPv6, versus LDPv4 where you match on IPv4,
to map the packet to the correct label stack rewrite.

Example: if your hardware ASICs do not support IPv6 but do support IPv4,
LDPv4 will work just fine while LDPv6 will have a rather hard time :)
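A toy illustration of that ingress difference. The FEC tables and label values are invented, and real ASICs obviously do this in hardware lookup pipelines, not software; the point is only that the LDPv6 ingress lookup keys on an IPv6 destination.

```python
# Sketch: the ingress FEC lookup keys on the IP destination. An ASIC
# that can only match IPv4 can still serve LDPv4 ingress, but has no
# way to perform the IPv6 match that LDPv6 ingress requires.
import ipaddress

fec_to_label = {
    ipaddress.ip_network("203.0.113.0/24"): 24001,  # LDPv4 FEC
    ipaddress.ip_network("2001:db8::/32"): 24002,   # LDPv6 FEC
}

def ingress_lookup(dst, supports_v6=False):
    """Return the transport label for a destination, if matchable."""
    addr = ipaddress.ip_address(dst)
    if addr.version == 6 and not supports_v6:
        raise NotImplementedError("ASIC cannot match IPv6 at ingress")
    for net, label in fec_to_label.items():
        if addr.version == net.version and addr in net:
            return label
    return None

print(ingress_lookup("203.0.113.7"))  # 24001
```

With `supports_v6=False` (the v4-only ASIC of the example), any IPv6 destination simply cannot be mapped to a label stack, which is the "hard time" above.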

Cheers,
R.