Re: few big monolithic PEs vs many small PEs

2019-07-02 Thread Mark Tinka



On 27/Jun/19 21:41, James Bensley wrote:

>
>
> Large boxes like the MX2020, ASR9922, NCS6K, etc. can only reasonably 
> be used as P nodes, in my opinion.

The NCS6000 was always designed as a core router to replace the CRS. We
just haven't seen the need for one, since the CRS-X we operate
(8-slot chassis) is still more than enough for our requirements.

But yes, all of these edge routers, nowadays, are very decent core boxes
also, particularly if you run a BGP-free core and have no need to
support non-Ethernet links to any reasonable degree in there.

Mark.



Re: few big monolithic PEs vs many small PEs

2019-06-29 Thread Mark Tinka



On 28/Jun/19 10:35, adamv0...@netconsultings.com wrote:

> If the PEs are sufficiently small I'd even go further and split L3VPN-PEs vs 
> L2VPN-PEs etc., mostly because of streamlined/simplified hw 
> and code certification testing.
> But as with all the decentralize-centralize swings, one has to strike the 
> balance just right and weigh the aggregation pros against the too-many-eggs-in 
> one-basket cons.

On the VPN side, we sell more l2vpn than l3vpn. In fact, I don't believe
we've actually sold an l3vpn service, apart from the one we built to
deliver voice services.

l3vpn is a dying service in Africa. With everything in the cloud now,
everybody just wants a simple IP service.

Mark.


RE: few big monolithic PEs vs many small PEs

2019-06-28 Thread adamv0025
> Mark Tinka
> Sent: Thursday, June 27, 2019 4:31 PM
> 
> 
> 
> On 27/Jun/19 14:48, James Bensley wrote:
> 
> > That to me is a simple scenario, and it can be mapped with a
> > dependency tree. But in my experience, and maybe it's just me, things
> > are usually a lot more complicated than this. The root cause is
> > probably a bad design introducing too much complexity, which is
> > another vote for smaller PEs from me. With more service dedicated PEs
> > one can reduce or remove the possibility of piling multiple services
> > and more complexity onto the same PE(s).
> 
> Which is one of the reasons we - painfully to the bean counters - insist that
> routers are deployed for function.
> 
> We won't run peering and transit services on the same router.
> 
> We won't run SP and Enterprise on the same router as Broadband.
> 
> We won't run supporting services (DNS, RADIUS, WWW, FTP, Portals, NMS,
> etc.) on the same router where we terminate customers.
> 
> This level of distribution, although quite costly initially, means you reduce 
> the
> inter-dependency of services at a hardware level, and can safely keep things
> apart so that when bits fail, you aren't committing other services to the same
> fate.
> 
If the PEs are sufficiently small I'd even go further and split L3VPN-PEs vs 
L2VPN-PEs etc., mostly because of streamlined/simplified hw 
and code certification testing.
But as with all the decentralize-centralize swings, one has to strike the 
balance just right and weigh the aggregation pros against the too-many-eggs-in 
one-basket cons.

adam



RE: few big monolithic PEs vs many small PEs

2019-06-28 Thread adamv0025
Hi James,

> From: James Bensley 
> Sent: Thursday, June 27, 2019 1:48 PM
> 
> On Thu, 27 Jun 2019 at 12:46,  wrote:
> >
> > > From: James Bensley 
> > > Sent: Thursday, June 27, 2019 9:56 AM
> > >
> > > One experience I have had is that when there is an outage on a
> > > large PE, even when it still has spare capacity, the
> > > business impact can be too much to handle (the support desk is
> > > overwhelmed, customers become irate if you can't quickly tell them
> > > what all the impacted services are or when service will be restored,
> > > the NMS has so many alarms it’s not clear what the problem is or where
> > > it's coming from etc.).
> > >
> > I see what you mean; my hope is to address these challenges by having a
> > "single source of truth" provisioning system that will have, among other
> > things, a HW-to-customer/service mapping -so the Ops team will be able to say
> > that if a particular LC X fails then customers/services X, Y, Z will be affected.
> > But yes, I agree that with smaller PEs any failure fallout is minimized
> > proportionally.
> 
> Hi Adam,
> 
> My experience is that it is much more complex than that (although it also
> depends on what sort of service you're offering); one can't easily model the
> inter-dependency between multiple physical assets like links, interfaces, line
> cards, racks, DCs etc. and logical services such as VRFs/L3VPNs, cloud-hosted
> proxies and the P edge.
> 
> Consider this, in my opinion, relatively simple example:
> Three PEs in a triangle. Customer is dual-homed to PE1 and PE2 and their link
> to PE1 is their primary/active link. Transit is dual-homed to PE2 and PE3 and
> your hosted filtering service cluster is also dual-homed to PE2 and PE3 to be
> near the Internet connectivity.
> 
I agree the scenario you proposed is perfectly valid: it seems simple but might 
contain a high degree of complexity in terms of traffic patterns.
Thinking about this I'd propose to separate the problem into two parts.

The simpler one to solve is the physical resource allocation part of the 
problem.
This is where a hierarchical record of physical assets could give us the 
right answer to "what happens if this card fails?" 
(example of hierarchy: POP->PE->LineCard->PhysicalPort(s)-> 
PhysicalPort(s)->Aggregation-SW->PhysicalPort(s)->Customer/Service)
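
As a minimal sketch of that "which customers hang off this card" lookup 
(Python; the inventory, asset names, and tree shape are purely hypothetical):

inventory = {
    "POP1": ["PE1"],
    "PE1": ["LC0", "LC1"],
    "LC0": ["xe-0/0/0", "xe-0/0/1"],
    "LC1": ["xe-1/0/0"],
    "xe-0/0/0": ["AggSW1"],
    "AggSW1": ["cust-A", "cust-B"],
    "xe-0/0/1": ["cust-C"],
    "xe-1/0/0": ["cust-D"],
}

def impacted(asset):
    """Walk the hierarchy below a failed asset; leaves are customers/services."""
    children = inventory.get(asset)
    if children is None:          # leaf node: a customer/service
        return [asset]
    hits = []
    for child in children:
        hits.extend(impacted(child))
    return hits

print(impacted("LC0"))  # -> ['cust-A', 'cust-B', 'cust-C']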

The other part of the problem is much harder and has two subparts: 
-The first subpart is to model interactions between a number of protocols to 
accurately predict traffic patterns under various failure conditions.
(I'd argue that this should, to some extent, be part of the design documentation 
and be well understood and tested during POC testing for a new design -although 
entropy...)
-And now the tricky subpart is being able to map individual 
customer->service/service->customer traffic flows onto the first subpart.
(I haven't given this subpart much thought, so I can't comment.)

adam   



Re: few big monolithic PEs vs many small PEs

2019-06-28 Thread Mark Tinka


On 28/Jun/19 01:23, Mike Hammett wrote:

> I've run into many providers where they had routers in the top 10 or
> 15 markets...  and that was it. If you wanted a connection in South
> Bend or Indianapolis or New Orleans or Ohio or...  you were backhauled
> potentially hundreds of miles to a nearby big market.
>
> More smaller POPs reduces the tromboning.
>
> More smaller POPs means that one POP's outage isn't as disastrous on
> the traffic rerouting around it.

I really dislike centralized routing.

Mark.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread Mike Hammett
Big routers are also a lot more expensive. You have to squeeze more 
life out of them because they cost you hundreds of thousands of dollars. You 
run them longer than you really should. 


If you run more, smaller, $20k or $30k routers, you'll replace them on a more 
reasonable cycle. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

- Original Message -

From: adamv0...@netconsultings.com 
To: nanog@nanog.org 
Sent: Wednesday, June 19, 2019 3:22:45 PM 
Subject: few big monolithic PEs vs many small PEs 

Hi folks, 

Recently I ran into a peculiar situation where we had to cap a couple of PEs 
even though merely half of the rather big chassis was populated with 
cards, the reason being that the central RE/RP was not able to cope with the 
combined number of routes/vrfs/bgp sessions/etc.. 

So this made me think about the best strategy in building out SP-Edge 
nowadays (yes I'm aware of the centralize/decentralize pendulum swinging 
every couple of years). 
The conclusion I came to was that *currently the best approach would be to 
use several medium to small (fixed) PEs to replace a big monolithic 
chassis-based system. 
So what I was thinking is, 
Yes it will cost a bit more (a router is more expensive than an LC) 
Will end up with more prefixes in IGP, more BGP sessions etc.. -don't care. 
But the benefits are less eggs in one basket, simplified and hence faster 
testing in case of specialized PEs and obviously better RP CPU/MEM to port 
ratio. 
Am I missing anything please? 

*currently, 
Yes, some old chassis systems or even multi-chassis systems used to support 
additional RPs and offloading some of the processes (e.g. BGP) onto those 
-the problem is these are custom hacks and still a single OS which needs 
rebooting of LCs/ASICs when being upgraded -so the problem of too many eggs in 
one basket still exists (yes, Cisco NCS6k and recent ASR9k Lightspeed LCs are 
an exception) 
And yes there is the "node-slicing" approach from Juniper where one can 
offload CP onto multiple x86 servers and assign LCs to each server (virtual 
node) - which would solve my chassis-full problem -but honestly, how many of 
you are running such a setup? Exactly. And that's why I'd be hesitant to 
deploy this solution in production just yet. I don't know of any other 
vendor solution like this one, but who knows maybe in 5 years this is going 
to be the new standard. Anyways I need a solution/strategy for the next 3-5 
years. 


Would like to hear what your thoughts are on this conundrum. 

adam 

netconsultings.com 
::carrier-class solutions for the telecommunications industry:: 





Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread Mike Hammett
I've run into many providers where they had routers in the top 10 or 15 
markets... and that was it. If you wanted a connection in South Bend or 
Indianapolis or New Orleans or Ohio or... you were backhauled potentially 
hundreds of miles to a nearby big market. 


More smaller POPs reduces the tromboning. 


More smaller POPs means that one POP's outage isn't as disastrous on the 
traffic rerouting around it. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 






Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread James Bensley



On 27 June 2019 16:26:03 BST, Mark Tinka  wrote:
>
>
>On 27/Jun/19 10:58, James Bensley wrote:
>
>> Hi Adam,
>>
>> Over the years I have been bitten multiple times by having fewer big
>> routers with either far too many services/customers connected to them
>> or too much traffic going through them. These days I always go for
>> smaller/more routers rather than fewer/larger routers.
>>
>> One experience I have had is that when there is an outage on a large
>> PE, even when it still has spare capacity, the business impact
>> can be too much to handle (the support desk is overwhelmed, customers
>> become irate if you can't quickly tell them what all the impacted
>> services are or when service will be restored, the NMS has so many
>> alarms it’s not clear what the problem is or where it's coming from
>> etc.).
>>
>> I’ve seen networks place a change freeze on devices, with the exception
>> of changes that migrate customers or services off of the PE, because
>> any outage would create too great an impact to the business, or risk
>> the customers terminating their contracts. I’ve also seen a change
>> freeze placed upon large PEs because the complexity was too great,
>> trying to work out the impact of a change on one of the original PEs
>> from when the network was first built, which is somehow linked to
>> virtually every service on the network in some obscure and
>> unforeseeable way.
>
>I would tend to agree when the edge routers are massive, e.g., boxes
>like the Cisco ASR9922 or the Juniper MX2020 are simply too large, and
>present a real risk re: that level of customer aggregation (even for
>low-revenue services such as Broadband). I don't think I'd ever justify
>buying these towers to aggregate customers, mainly due to the risk.
>
>For us, even the MX960 is too big, which is why we focus on the MX480
>(ASR9906 being the equivalent). It's a happy medium between the small
>and large end of the spectrum.
>
>And as I mentioned before, we just look at a totally different box for
>100Gbps customers.
>
>Mark.

Yeah, if you want to name specific boxes then yes, I've had similar experiences 
with the same boxen. Even the MX960 is slightly too big for a PE depending on 
how you load it (port combinations).

Large boxes like the MX2020, ASR9922, NCS6K, etc. can only reasonably be 
used as P nodes, in my opinion.

Cheers,
James.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread James Bensley
On 27 June 2019 16:31:27 BST, Mark Tinka  wrote:
>
>
>On 27/Jun/19 14:48, James Bensley wrote:
>
>> That to me is a simple scenario, and it can be mapped with a
>> dependency tree. But in my experience, and maybe it's just me, things
>> are usually a lot more complicated than this. The root cause is
>> probably a bad design introducing too much complexity, which is
>> another vote for smaller PEs from me. With more service dedicated PEs
>> one can reduce or remove the possibility of piling multiple services
>> and more complexity onto the same PE(s).
>
>Which is one of the reasons we - painfully to the bean counters -
>insist
>that routers are deployed for function.
>
>We won't run peering and transit services on the same router.
>
>We won't run SP and Enterprise on the same router as Broadband.
>
>We won't run supporting services (DNS, RADIUS, WWW, FTP, Portals, NMS,
>etc.) on the same router where we terminate customers.
>
>This level of distribution, although quite costly initially, means you
>reduce the inter-dependency of services at a hardware level, and can
>safely keep things apart so that when bits fail, you aren't committing
>other services to the same fate.
>
>Mark.

Agreed. This has worked well for me over time.

It's costly in the initial capex outlay, but these boxes will have different 
upgrade/capacity increase times and price points, so over time everything 
spreads out.

Massive iron upgrades require biblical business cases and epic battles to get 
the funds approved. Periodic small to medium PE upgrades are nicer on the 
annual budget and the forecasting.

Cheers,
James.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread Mark Tinka



On 27/Jun/19 14:48, James Bensley wrote:

> That to me is a simple scenario, and it can be mapped with a
> dependency tree. But in my experience, and maybe it's just me, things
> are usually a lot more complicated than this. The root cause is
> probably a bad design introducing too much complexity, which is
> another vote for smaller PEs from me. With more service dedicated PEs
> one can reduce or remove the possibility of piling multiple services
> and more complexity onto the same PE(s).

Which is one of the reasons we - painfully to the bean counters - insist
that routers are deployed for function.

We won't run peering and transit services on the same router.

We won't run SP and Enterprise on the same router as Broadband.

We won't run supporting services (DNS, RADIUS, WWW, FTP, Portals, NMS,
etc.) on the same router where we terminate customers.

This level of distribution, although quite costly initially, means you
reduce the inter-dependency of services at a hardware level, and can
safely keep things apart so that when bits fail, you aren't committing
other services to the same fate.

Mark.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread Mark Tinka



On 27/Jun/19 14:03, adamv0...@netconsultings.com wrote:

> I believe it would, for a time, but it would require a SW upgrade, testing, 
> etc.; even newer SW in itself gave us better resource management and 
> performance optimizations.
> However, even with a powerful CP and streamlined SW we'd still just be buying 
> time while pushing the envelope. 
> Hence the decentralization at the edge seems like a natural strategy to exit 
> the uroboros paradigm.

Well, this is one area where I can't meaningfully add value, since you
know your environment better than anyone else on this list.

Mark.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread Mark Tinka



On 27/Jun/19 10:58, James Bensley wrote:

> Hi Adam,
>
> Over the years I have been bitten multiple times by having fewer big
> routers with either far too many services/customers connected to them
> or too much traffic going through them. These days I always go for
> smaller/more routers rather than fewer/larger routers.
>
> One experience I have had is that when there is an outage on a large
> PE, even when it still has spare capacity, the business impact
> can be too much to handle (the support desk is overwhelmed, customers
> become irate if you can't quickly tell them what all the impacted
> services are or when service will be restored, the NMS has so many
> alarms it’s not clear what the problem is or where it's coming from
> etc.).
>
> I’ve seen networks place a change freeze on devices, with the exception
> of changes that migrate customers or services off of the PE, because
> any outage would create too great an impact to the business, or risk
> the customers terminating their contracts. I’ve also seen a change
> freeze placed upon large PEs because the complexity was too great,
> trying to work out the impact of a change on one of the original PEs
> from when the network was first built, which is somehow linked to
> virtually every service on the network in some obscure and
> unforeseeable way.

I would tend to agree when the edge routers are massive, e.g., boxes
like the Cisco ASR9922 or the Juniper MX2020 are simply too large, and
present a real risk re: that level of customer aggregation (even for
low-revenue services such as Broadband). I don't think I'd ever justify
buying these towers to aggregate customers, mainly due to the risk.

For us, even the MX960 is too big, which is why we focus on the MX480
(ASR9906 being the equivalent). It's a happy medium between the small
and large end of the spectrum.

And as I mentioned before, we just look at a totally different box for
100Gbps customers.

Mark.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread James Bensley
On Thu, 27 Jun 2019 at 12:46,  wrote:
>
> > From: James Bensley 
> > Sent: Thursday, June 27, 2019 9:56 AM
> >
> > One experience I have had is that when there is an outage on a large PE,
> > even when it still has spare capacity, the business impact can be too
> > much to handle (the support desk is overwhelmed, customers become irate
> > if you can't quickly tell them what all the impacted services are or when
> > service will be restored, the NMS has so many alarms it’s not clear what the
> > problem is or where it's coming from etc.).
> >
> I see what you mean; my hope is to address these challenges by having a 
> "single source of truth" provisioning system that will have, among other 
> things, a HW-to-customer/service mapping -so the Ops team will be able to say 
> that if a particular LC X fails then customers/services X, Y, Z will be affected.
> But yes, I agree that with smaller PEs any failure fallout is minimized 
> proportionally.

Hi Adam,

My experience is that it is much more complex than that (although it
also depends on what sort of service you're offering); one can't
easily model the inter-dependency between multiple physical assets
like links, interfaces, line cards, racks, DCs etc. and logical
services such as VRFs/L3VPNs, cloud-hosted proxies and the P edge.

Consider this, in my opinion, relatively simple example:
Three PEs in a triangle. Customer is dual-homed to PE1 and PE2 and
their link to PE1 is their primary/active link. Transit is dual-homed
to PE2 and PE3 and your hosted filtering service cluster is also
dual-homed to PE2 and PE3 to be near the Internet connectivity.

How will you record the inter-dependency whereby an outage on PE3
impacts the Customer? Because when that Customer sends traffic to PE1
(let's say all their operations are hosted in a public cloud provider),
and PE1 has learned the shortest path to 0/0 or ::0/0 from PE2, the
Internet traffic is sent from PE1 to PE2, and from PE2 into your
filtering cluster, and when the traffic comes back into PE2 after
passing through the filters it is then sent to PE3, because the transit
provider attached to PE3 has a better route to the Customer's destination
(AWS/Azure/GCP/whatever) than the one directly attached to PE2.
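
As a rough sketch of why a simple per-port dependency tree misses this
(Python; all hop names are illustrative only): walking the actual forwarding
path of that flow puts PE3 in the Customer's failure domain even though the
Customer has no circuit there.

flow_path = [
    "Customer", "PE1",         # primary access link
    "PE2", "filter-cluster",   # PE1's best default route points at PE2
    "PE2", "PE3", "Transit-B", # post-filtering, PE3's transit wins best path
]

failure_domain = set(flow_path)

for pe in ("PE1", "PE2", "PE3"):
    print(pe, "affects Customer:", pe in failure_domain)
# PE3 prints True despite the Customer having no attachment circuit on it,
# which is exactly what a pure port->customer hierarchy cannot see.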

That to me is a simple scenario, and it can be mapped with a
dependency tree. But in my experience, and maybe it's just me, things
are usually a lot more complicated than this. The root cause is
probably a bad design introducing too much complexity, which is
another vote for smaller PEs from me. With more service dedicated PEs
one can reduce or remove the possibility of piling multiple services
and more complexity onto the same PE(s).

Most places I've seen (managed service providers) simply can't map the
complex inter-dependencies they have between physical and logical
infrastructure without having some super-bespoke and also complex
asset management / CMDB / CI system.

Cheers,
James.


Re: few big monolithic PEs vs many small PEs

2019-06-27 Thread James Bensley
On Wed, 19 Jun 2019 at 21:23,  wrote:
>
> Hi folks,
>
> Recently I ran into a peculiar situation where we had to cap a couple of PEs
> even though merely half of the rather big chassis was populated with
> cards, the reason being that the central RE/RP was not able to cope with the
> combined number of routes/vrfs/bgp sessions/etc..
>
> So this made me think about the best strategy in building out SP-Edge
> nowadays (yes I'm aware of the centralize/decentralize pendulum swinging
> every couple of years).
> The conclusion I came to was that *currently the best approach would be to
> use several medium to small (fixed) PEs to replace a big monolithic
> chassis-based system.
> So what I was thinking is,
> Yes it will cost a bit more (a router is more expensive than an LC)
> Will end up with more prefixes in IGP, more BGP sessions etc.. -don't care.
> But the benefits are less eggs in one basket, simplified and hence faster
> testing in case of specialized PEs and obviously better RP CPU/MEM to port
> ratio.
> Am I missing anything please?
>
> *currently,
> Yes, some old chassis systems or even multi-chassis systems used to support
> additional RPs and offloading some of the processes (e.g. BGP) onto those
> -the problem is these are custom hacks and still a single OS which needs
> rebooting of LCs/ASICs when being upgraded -so the problem of too many eggs in
> one basket still exists (yes, Cisco NCS6k and recent ASR9k Lightspeed LCs are
> an exception)
> And yes there is the "node-slicing" approach from Juniper where one can
> offload CP onto multiple x86 servers and assign LCs to each server (virtual
> node) - which would solve my chassis-full problem -but honestly, how many of
> you are running such a setup? Exactly. And that's why I'd be hesitant to
> deploy this solution in production just yet. I don't know of any other
> vendor solution like this one, but who knows maybe in 5 years this is going
> to be the new standard. Anyways I need a solution/strategy for the next 3-5
> years.
>
>
> Would like to hear what your thoughts are on this conundrum.
>
> adam
>
> netconsultings.com
> ::carrier-class solutions for the telecommunications industry::

Hi Adam,

Over the years I have been bitten multiple times by having fewer big
routers with either far too many services/customers connected to them
or too much traffic going through them. These days I always go for
smaller/more routers rather than fewer/larger routers.

One experience I have had is that when there is an outage on a large
PE, even when it still has spare capacity, the business impact
can be too much to handle (the support desk is overwhelmed, customers
become irate if you can't quickly tell them what all the impacted
services are or when service will be restored, the NMS has so many
alarms it’s not clear what the problem is or where it's coming from
etc.).

I’ve seen networks place a change freeze on devices, with the exception
of changes that migrate customers or services off of the PE, because
any outage would create too great an impact to the business, or risk
the customers terminating their contracts. I’ve also seen a change
freeze placed upon large PEs because the complexity was too great,
trying to work out the impact of a change on one of the original PEs
from when the network was first built, which is somehow linked to
virtually every service on the network in some obscure and
unforeseeable way.

This doesn’t mean there isn’t a place for large routers. For example,
in a typical network, by the time we get to the P nodes layer in the
core we tend to have high levels of redundancy, i.e. any PE is
dual-homed to two or more P nodes and will have 100% redundant
capacity. Down at the access layer customers may be connected to a
single access layer device or the access layer device might have a
single backhaul link. So technically we have lots of customers,
services and traffic passing through larger P node devices, but these
devices have a low rate of changes / low touch, perform a low number
of functions, they are operationally simple, and are highly redundant.
Conversely, at the service edge, which I guess is your main concern
here, I’m all about more smaller devices with single service dedicated
devices.

I’ve tried to write some of my experiences here
(https://null.53bits.co.uk/index.php?page=few-larger-routers-vs.-many-smaller-routers).
The tl;dr version though is that there’s rarely a technical
restriction to having fewer large routers and it’s an
operational/business impact problem.

I'd like to hear from anyone who has had great success with fewer larger PEs.

Cheers,
James.


RE: few big monolithic PEs vs many small PEs

2019-06-27 Thread adamv0025
> From: Mark Tinka 
> Sent: Friday, June 21, 2019 1:27 PM
> 
> 
> 
> On 21/Jun/19 10:32, adamv0...@netconsultings.com wrote:
> 
> > So this particular case, the major POPs, is actually where we ran into the
> problem of RE/RP becoming full (too many VRFs/Routes/BGP sessions)
> halfway through the chassis.
> > Hence I'm considering whether it's actually better to go with multiple small
> chassis and/or fixed form PEs in the rack as opposed to half/full rack 
> chassis.
> 
> Are you saying that even the fastest and biggest control plane on the market
> for your chassis is unable to support your requirements (assuming their cost
> did not stop you from looking at them in the first place)?
> 
I believe it would, for a time, but it would require a SW upgrade, testing, etc.; 
even newer SW in itself gave us better resource management and performance 
optimizations.
However, even with a powerful CP and streamlined SW we'd still just be buying time 
while pushing the envelope. 
Hence the decentralization at the edge seems like a natural strategy to exit 
the uroboros paradigm.

adam  



RE: few big monolithic PEs vs many small PEs

2019-06-27 Thread adamv0025
> From: James Bensley 
> Sent: Thursday, June 27, 2019 9:56 AM
> 
> One experience I have had is that when there is an outage on a large PE,
> even when it still has spare capacity, the business impact can be too
> much to handle (the support desk is overwhelmed, customers become irate
> if you can't quickly tell them what all the impacted services are or when
> service will be restored, the NMS has so many alarms it’s not clear what the
> problem is or where it's coming from etc.).
> 
I see what you mean; my hope is to address these challenges by having a "single 
source of truth" provisioning system that will have, among other things, a 
HW-to-customer/service mapping -so the Ops team will be able to say that if a 
particular LC X fails then customers/services X, Y, Z will be affected. 
But yes, I agree that with smaller PEs any failure fallout is minimized 
proportionally.
 
> 
> This doesn’t mean there isn’t a place for large routers. For example, in a
> typical network, by the time we get to the P nodes layer in the core we tend
> to have high levels of redundancy, i.e. any PE is dual-homed to two or more P
> nodes and will have 100% redundant capacity. 
Exactly. While the service edge topology might be dynamic as a result of 
horizontal scaling, the core, on the other hand, I'd say should be fairly static 
and scaled vertically; that is, I wouldn't want to scale core routers 
horizontally and as a result have the core topology changing with every P 
scale-out iteration at any POP -that would be bad news for capacity planning and 
traffic engineering... 

> 
> I’ve tried to write some of my experiences here
> (https://null.53bits.co.uk/index.php?page=few-larger-routers-vs.-many-
> smaller-routers).
> The tl;dr version though is that there’s rarely a technical restriction to 
> having
> fewer large routers and it’s an operational/business impact problem.
> 
I'll give it a read, cheers.

adam



Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Bryan Holloway

On 6/21/19 10:01 AM, Aaron Gould wrote:

I was reading this and thought, planet earth is a single point of failure.

...but, I guess we build and design and connect as much redundancy (logic, hw, 
sw, power) as the customer requires and pays for and that we can truly 
accomplish.

-Aaron




I don't know about you, but we keep two earths in active/standby. Sure, 
the power requirements are through the roof, but hey -- it's worth it.


Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Anderson, Charles R
On Fri, Jun 21, 2019 at 09:01:38AM -0500, Aaron Gould wrote:
> I was reading this and thought, planet earth is a single point of failure.
> 
> ...but, I guess we build and design and connect as much redundancy (logic, 
> hw, sw, power) as the customer requires and pays for and that we can 
> truly accomplish.

Fate sharing is also an important concept in system design.


RE: few big monolithic PEs vs many small PEs

2019-06-21 Thread Aaron Gould
I was reading this and thought, planet earth is a single point of failure.

...but, I guess we build and design and connect as much redundancy (logic, hw, 
sw, power) as the customer requires and pays for and that we can truly 
accomplish.

-Aaron





Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Mike Hammett
" It is not economical or even physically possible to have an MPLS device next 
to every DSLAM, hence the aggregation." 


https://mikrotik.com/product/RB750r2 MSRP $39.95 


I readily admit that this device isn't large enough for most cases, but you can 
get cheap and small MPLS routers. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

- Original Message -

From: "Tarko Tikan"  
To: adamv0...@netconsultings.com, nanog@nanog.org 
Sent: Friday, June 21, 2019 2:51:20 AM 
Subject: Re: few big monolithic PEs vs many small PEs 

hey, 

> So what is the primary goal of us using the aggregation/access layer? It's to 
> achieve better utilization of the expensive router ports right? (hence called 
> aggregation) 

I'm in the eyeball business so saving router ports is not a primary concern. 

Aggregation exists to aggregate downstream access devices like DSLAMs, 
OLTs etc. First of all they have interfaces that are not available in 
your typical PEs. Secondly they are physically located further 
downstream, closer to the customers. It is not economical or even 
physically possible to have an MPLS device next to every DSLAM, hence 
the aggregation. 

Eyeball network topologies are very much driven by fiber layout that 
might have been built 10+ years ago following TDM network best practices 
(rings). 

Ideally (and if your market situation and finances allow this) you want 
your access device (or in the PON case, perhaps even an OLT linecard) to be 
the only SPOF. If you now uplink this access device to a PE, a PE linecard 
becomes a SPOF for many access devices -let's say 40, as this is a typical 
port count. 

If you don't want this to happen you can use a second fiber pair for a 
second uplink, but you typically don't have fiber to a second aggregation 
site. So your only option is to build on the same fiber (so that's a SPOF 
too) to the same site. If you now uplink to the same PE, you will still 
lose both uplinks during software upgrades. 

Two devices will help with that, making aggregation upgrades invisible 
for customers and thus improving customer satisfaction. Again, it very much 
depends on the market; here, the customers get nosy if they have more than 
one or two planned maintenances in a year (and this is not for some 
premium L3VPN service but just internet). 

-- 
tarko 



Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Mark Tinka



On 21/Jun/19 10:32, adamv0...@netconsultings.com wrote:

> Well yes, but if, say, I compare just a single line-card's cost to a standalone 
> fixed-format 1RU router with similar capacity, the card will always be 
> cheaper; and then as I start adding cards on the left-hand side of the 
> equation, things should start to even out gradually (the problem is this gradual 
> increase is just a theoretical exercise -there are no fixed PE products to do 
> this with).
> Yes, I can compare an MPC7 with an MX204, or an ASR9901 with some Tomahawk 
> card(s) -probably not apples to apples? 

Yes, you can't always do that because not many vendors create 1U router
versions of their line cards. The MX204 is probably one of those that
comes reasonably close.

I'm not sure deciding whether you get an MPC7 line card or an MX204 will
be a meaningful exercise. You need to determine what your use-case fits.
For example, rather than buy MPC7 line cards to support 100Gbps
customers in our MX480's, it is easier to buy an MX10003. That way, we
can keep the MPC2 line cards in the MX480 chassis to support up to N x
10Gbps of customer links (aggregated to an Ethernet switch, of course)
and not pay the cost of trying to run 100Gbps services through the MX480.

The MX10003 would then be dedicated for 100Gbps customers (and 40Gbps),
meaning we can manage the ongoing operational costs of each type of
customer for a specific box.

We have thought about using MX204's to support 40Gbps and 100Gbps
customers, but there aren't enough ports on it for it to make sense,
particularly given those types of customers will want the routers they
connect to to have some kind of physical redundancy, which the MX204
does not have.

Our use-case for the MX204 is:

    - Peering.
    - Metro-E deployments for customers needing 10Gbps in the Access.


> Also, one interesting CAPEX factor to consider is the connectivity back to the 
> core: with many small PEs in a POP one would need a lot of ports on the core 
> routers, and once again the aggregation factor is somewhat lost in doing 
> so. Where I'd have just a couple of PEs with 100G back to the core, now I'd 
> need a bunch of 10s (bundled) or 40s -and would probably need additional cards 
> in the core routers to accommodate the need for PE ports in the POP.   

Yes, that's not a small issue to scoff at, and you raise a valid concern
that could be easily overlooked if you adopted several smaller edge
routers in the data centre in favour of fewer large ones.

That said, you could do what we do and have a Layer 2 core switching
network, where you aggregate all routers in the data centre, so that you
are not running point-to-point links between routers and your core
boxes. For us, because of this, we still have plenty of slots left in
our CRS-8 chassis 5 years after deploying them, even though we are
supporting several 100's of Gbps worth of downstream router capacity.


> Well, playing devil's advocate: having the metro rings built as dumb L1 or L2 
> with a pair of PEs at the top is cheaper -although not much cheaper nowadays; 
> the economics in this sector have changed significantly over the past years. 

A dumb Metro-E access with all the smarts in the core is cheap to build,
but expensive to operate.

You can't run away from the costs. You just have to decide whether you
want to pay costs in initial cash or in long-term operational headache.

> So this particular case, the major POPs, is actually where we ran into the 
> problem of RE/RP becoming full (too many VRFs/Routes/BGP sessions) halfway 
> through the chassis.
> Hence I'm considering whether it's actually better to go with multiple small 
> chassis and/or fixed form PEs in the rack as opposed to half/full rack 
> chassis. 

Are you saying that even the fastest and biggest control plane on the
market for your chassis is unable to support your requirements (assuming
their cost did not stop you from looking at them in the first place)?

Mark.



Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Mark Tinka



On 21/Jun/19 10:46, adamv0...@netconsultings.com wrote:

> I'd actually like to hear more on that if you don't mind.

What part, Juniper's Ethernet switching portfolio?


> You actually haven't answered the question, I'm afraid :)
> So would you connect the Juniper (now Arista) aggregation switch to at least 
> two PEs in the POP (or all PEs in the POP -"fabric-style"), or would you 
> consider a 1:1 mapping between an aggregation switch and a PE, please?

Each edge router connects to its own aggregation switch (one or more,
depending on the number of ports required). The outgoing EX4550's we
used were set up in a VC for ease of management when we needed more ports
on a router-switch pair. But since Arista don't support VC's, each
switch would have an independent port to the edge router. Based upon
experience with VC's and the EX4550, that's not necessarily a bad thing,
as what you provision and what you actually get and can use are totally
different things.

We do not dual-home aggregation switches to edge routers; that's just
asking for STP issues (which we once faced when we thought we should be
fancy and provide VRRP services between 2 edge routers and their
associated aggregation switches).

Mark.



RE: few big monolithic PEs vs many small PEs

2019-06-21 Thread adamv0025
> From: Mark Tinka
> Sent: Friday, June 21, 2019 9:07 AM
> 
> 
> 
> On 21/Jun/19 09:36, adamv0...@netconsultings.com wrote:
> 
> > And indeed there are cases where we connect customers directly on to
> > the PEs, but then it's somehow ok for a line-card to be part of just a
> > single chassis (or a PE).
> 
> We'd typically do this for very high-speed ports (100Gbps), as it's cheaper to
> aggregate 10Gbps-and-slower via an Ethernet switch trunking to a router line
> card.
> 
> 
> > Now let's take a step even further: what if the line-card is not inside the
> > chassis anymore, because it's a fabric-extender or a satellite card?
> > Why, all of a sudden, would we be uncomfortable again to have it be part of
> > just a single chassis (and there are tons of satellite/extender topologies to
> > prove that this is a real concern among operators)?
> 
> I never quite saw the use-case for satellite ports. To me, it felt like 
> vendors
> trying to find ways to lock you into their revenue stream forever, as many of
> these architectures do not play well with the other kids. I'd rather keep it
> simple and have 802.1Q trunks between router line cards and affordable
> Ethernet switches.
> 
> We are currently switching our Layer 2 aggregation ports in the data centre
> from Juniper to Arista, talking to a Juniper edge router. I'd have been in 
> real
> trouble if I'd fallen for Juniper's satellite system, as they have a number of
> shortfalls in the Layer 2 space, I feel.
> 
I'd actually like to hear more on that if you don't mind.

> 
> > So to circle back to a standalone aggregation device: should we try and
> > complicate the design by creating this "fabric" (PEs as "spine" and
> > aggregation devices as "leaf") in an attempt to increase resiliency, or shall
> > we treat each aggregation device as a unitary, indivisible part of a single
> > PE, as if it were a card in a chassis -because if the economics worked, it
> > would be a card in a chassis?
> 
> See my previous response to you.
> 
You actually haven't answered the question, I'm afraid :)
So would you connect the Juniper (now Arista) aggregation switch to at least two 
PEs in the POP (or all PEs in the POP -"fabric-style"), or would you consider a 
1:1 mapping between an aggregation switch and a PE, please?

adam 





RE: few big monolithic PEs vs many small PEs

2019-06-21 Thread adamv0025
Hey Mark,
> From: Mark Tinka
> Sent: Thursday, June 20, 2019 3:27 PM
> 
> On 19/Jun/19 22:22, adamv0...@netconsultings.com wrote:
> 
> > Yes it will cost a bit more (router is more expensive than a LC)
> 
> I found the reverse to be true... chassis are cheap. Line cards are costly.
> 
Well yes, but if, say, I compare just a single line-card's cost to a standalone 
fixed-format 1RU router with similar capacity, the card will always be 
cheaper; and then as I start adding cards on the left-hand side of the 
equation, things should start to even out gradually (the problem is this gradual 
increase is just a theoretical exercise -there are no fixed PE products to do 
this with).
Yes, I can compare an MPC7 with an MX204, or an ASR9901 with some Tomahawk 
card(s) -probably not apples to apples? 
But if I were to venture above 1/2RU then I'm back in chassis-based systems, 
paying extra for REs/RPs and fabric and fans and PSUs with every small PE 
I'm putting in -so then I'm talking about adding two new cards to an existing 
chassis vs. adding two new cards to a new chassis. 

Also, one interesting CAPEX factor to consider is the connectivity back to the 
core: with many small PEs in a POP one would need a lot of ports on the core 
routers, and once again the aggregation factor is somewhat lost in doing 
so. Where I'd have just a couple of PEs with 100G back to the core, now I'd need 
a bunch of 10s (bundled) or 40s -and would probably need additional cards in the 
core routers to accommodate the need for PE ports in the POP.   
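
As a crude illustration of that core-port arithmetic (Python; the counts are 
invented assumptions, not a design):

big_pe_uplinks = 2                # one big PE: 2 x 100G, one per core router

small_pes = 8                     # the same edge capacity split into small PEs
small_pe_uplinks = small_pes * 2  # each small PE still wants a redundant pair

print("core-facing ports, one big PE:", big_pe_uplinks)    # 2
print("core-facing ports, small PEs :", small_pe_uplinks)  # 16
# 16 core ports instead of 2: the aggregation factor is lost, and the core
# routers may need extra line cards just to terminate PE uplinks.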

> 
> >
> > Would like to hear what are your thoughts on this conundrum.
> 
> So this depends on where you want to deliver your service, and the function,
> in my opinion.
> 
> If you are talking about an IP/MPLS-enabled Metro-E network, then having
> several, smaller routers spread across one or more rings is cheaper and more
> effective.
> 
Well, playing devil's advocate: having the metro rings built as dumb L1 or L2 
with a pair of PEs at the top is cheaper -although not much cheaper nowadays; the 
economics in this sector have changed significantly over the past years. 


> If you are delivering services to large customers from within a data centre,
> large edge routers make more sense, particularly given the rising costs of co-
> location.
> 
So this particular case, the major POPs, is actually where we ran into the 
problem of RE/RP becoming full (too many VRFs/Routes/BGP sessions) halfway 
through the chassis.
Hence I'm considering whether it's actually better to go with multiple small 
chassis and/or fixed form PEs in the rack as opposed to half/full rack chassis. 
 

adam




Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Mark Tinka



On 21/Jun/19 09:36, adamv0...@netconsultings.com wrote:

> And indeed there are cases where we connect customers directly on to
> the PEs, but then it's somehow ok for a line-card to be part of just a
> single chassis (or a PE). 

We'd typically do this for very high-speed ports (100Gbps), as it's
cheaper to aggregate 10Gbps-and-slower via an Ethernet switch trunking
to a router line card.


> Now let's take a step even further: what if the line-card is not inside the 
> chassis anymore, because it's a fabric-extender or a satellite card?
> Why, all of a sudden, would we be uncomfortable again to have it be part of 
> just a single chassis (and there are tons of satellite/extender topologies to 
> prove that this is a real concern among operators)?

I never quite saw the use-case for satellite ports. To me, it felt like
vendors trying to find ways to lock you into their revenue stream
forever, as many of these architectures do not play well with the other
kids. I'd rather keep it simple and have 802.1Q trunks between router
line cards and affordable Ethernet switches.

We are currently switching our Layer 2 aggregation ports in the data
centre from Juniper to Arista, talking to a Juniper edge router. I'd
have been in real trouble if I'd fallen for Juniper's satellite system,
as they have a number of shortfalls in the Layer 2 space, I feel.


> So to circle back to a standalone aggregation device: should we try and 
> complicate the design by creating this "fabric" (PEs as "spine" and aggregation 
> devices as "leaf") in an attempt to increase resiliency, or shall we treat each 
> aggregation device as a unitary, indivisible part of a single PE, as if it were 
> a card in a chassis -because if the economics worked, it would be a card in a 
> chassis?

See my previous response to you.

Mark.



Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Tarko Tikan

hey,


So what is the primary goal of us using the aggregation/access layer? It's to 
achieve better utilization of the expensive router ports right? (hence called 
aggregation)


I'm in the eyeball business so saving router ports is not a primary concern.

Aggregation exists to aggregate downstream access devices like DSLAMs, 
OLTs etc. First of all they have interfaces that are not available in 
your typical PEs. Secondly they are physically located further 
downstream, closer to the customers. It is not economical or even 
physically possible to have an MPLS device next to every DSLAM, hence 
the aggregation.


Eyeball network topologies are very much driven by fiber layout that 
might have been built 10+ years ago following TDM network best practices 
(rings).


Ideally (and if your market situation and finances allow this) you want 
your access device (or in the PON case, perhaps even an OLT linecard) to be 
the only SPOF. If you now uplink this access device to a PE, a PE linecard 
becomes a SPOF for many access devices -let's say 40, as this is a typical 
port count.


If you don't want this to happen you can use a second fiber pair for a 
second uplink, but you typically don't have fiber to a second aggregation 
site. So your only option is to build on the same fiber (so that's a SPOF 
too) to the same site. If you now uplink to the same PE, you will still 
lose both uplinks during software upgrades.


Two devices will help with that, making aggregation upgrades invisible 
for customers and thus improving customer satisfaction. Again, it very much 
depends on the market; here, the customers get nosy if they have more than 
one or two planned maintenances in a year (and this is not for some 
premium L3VPN service but just internet).
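
Back-of-the-envelope arithmetic for the same point (Python; the planned and 
unplanned outage figures are invented assumptions, not measurements):

HOURS = 8760
planned_h = 2.0    # hours/yr each PE is down for software upgrades
unplanned_h = 0.5  # hours/yr of unscheduled outage per PE

# Both uplinks landing on one PE: the customer eats every outage of that PE.
same_pe_outage = planned_h + unplanned_h

# Uplinks split across two PEs: planned work is staggered so it never
# overlaps, and only independent unplanned outages can coincide.
dual_pe_outage = (unplanned_h / HOURS) ** 2 * HOURS

print(f"same PE : {same_pe_outage:.2f} customer outage hours/yr")
print(f"two PEs : {dual_pe_outage * 60:.4f} customer outage minutes/yr")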


--
tarko


RE: few big monolithic PEs vs many small PEs

2019-06-21 Thread adamv0025
Hey,

> From: Tarko Tikan
> Sent: Thursday, June 20, 2019 8:28 AM
> 
> hey,
> 
> > For availability I think it is best approach to do many small edge
> > devices.
> 
> This is also great for planned maintenance. ISSU has not really worked out for
> any of the vendors and with two small devices you can upgrade them
> independently.
>
Yup, I guess no one is really using ISSU in production; and even with ISSU, 
currently, most of the NPUs on the market need to be power-cycled to load a new 
version of microcode, so there's packet loss on the data-plane anyway. 
 
> Great for aggregation, enables you to dual-home access devices into two
> separate PEs that will never be down at the same time be it failure or
> planned maintenance (excluding the physical issues like power/cooling but
> dual-homing to two separate sites is always problematic for eyeball
> networks).
> 
Actually this is an interesting point you just raised.
(note: the assumption for the below is single-homed customers, as a 
dual-homed customer would probably want to be at least site-diverse and pay a 
premium for that service)
So what is the primary goal of us using the aggregation/access layer? It's to 
achieve better utilization of the expensive router ports right? (hence called 
aggregation)
And indeed there are cases where we connect customers directly on to the PEs, 
but then it's somehow ok for a line-card to be part of just a single chassis 
(or a PE).
Now let's take a step even further: what if the line-card is not inside the 
chassis anymore, because it's a fabric-extender or a satellite card?
Why, all of a sudden, would we be uncomfortable again to have it be part of just 
a single chassis (and there are tons of satellite/extender topologies to prove 
that this is a real concern among operators)?
So to circle back to a standalone aggregation device: should we try and 
complicate the design by creating this "fabric" (PEs as "spine" and aggregation 
devices as "leaf") in an attempt to increase resiliency, or shall we treat each 
aggregation device as a unitary, indivisible part of a single PE, as if it were a 
card in a chassis -because if the economics worked, it would be a card in a 
chassis?

adam



Re: few big monolithic PEs vs many small PEs

2019-06-21 Thread Saku Ytti
On Fri, 21 Jun 2019 at 10:09,  wrote:

> Just on the human cockups though, we're putting more and more automation in 
> to help address the problem of human imperfections.

With automation we break far, far less often -but when we break, we
break far, far more. MTTR is also increased due to skill rot: in a
CLI-jockey network you break something every day and you have to
troubleshoot and fix it, so even fixing complex problems becomes routine.
With automation, years may pass without complex outages; when they
happen, people panic and are less able to act logically and focus on a
single problem.

I am absolutely PRO automation. But I'm saying there is a cost.

-- 
  ++ytti


RE: few big monolithic PEs vs many small PEs

2019-06-21 Thread adamv0025
Hey Saku,

> From: Saku Ytti 
> Sent: Thursday, June 20, 2019 7:04 AM
> 
> On Wed, 19 Jun 2019 at 23:25,  wrote:
> 
> > The conclusion I came to was that *currently the best approach would
> > be to use several medium to small (fixed) PEs to replace a big
> > monolithic chassis-based system.
> 
> For availability I think the best approach is to do many small edge devices.
> Because software is terrible, and will always be terrible. People are bad at
> operating the devices and will always be. Hardware is something we think
> about a lot when we think about redundancy, but it's not that common a reason
> for an outage.
> With more smaller boxes the inevitable human cockup and software defects
> will affect fewer customers. Why I believe this to be true is because the
> events are sufficiently rare and once those happen, we find a solution or at
> the very least a workaround rather fast. With full inaction you could argue
> that having A3 and B1+B2 is the same amount of aggregate outage, as while an
> outage in B affects fewer customers, there are two B nodes with equal
> probability of outage. But I argue that the events are not independent, they
> are dependent, so the probability calculation isn't straightforward. Once we
> get some rare software defect or operator mistake on B1, we usually solve it
> before it triggers on B2, making the aggregate downtime of the entire system
> lower.
>
Yup, I agree. 
Just on the human cockups though: we're putting more and more automation in to 
help address the problem of human imperfections.
But automation can actually go both ways; some say it helps with the small 
day-to-day problems but occasionally creates a massive one.
So considering the B1 & B2 correlation: if operations on these are automated 
then, depending on how the automation system is designed/operated, one might 
not get the chance to reflect/assess on B1 before B2 is touched -so this might 
further complicate the equation for the aggregate system downtime computation.
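
A sketch of one possible guard against that (Python): make the automation soak 
and health-check B1 before it is allowed to touch B2. The node names and the 
upgrade/health-check helpers are hypothetical stubs, not any real tooling:

import time

def upgrade(node):
    print(f"upgrading {node} ...")   # push the image/config here

def healthy(node):
    return True                      # poll alarms/BGP/IGP state here

def staged_rollout(nodes, soak_seconds=3600):
    for node in nodes:
        upgrade(node)
        time.sleep(soak_seconds)     # let slow-burn defects surface on B1 first
        if not healthy(node):
            raise RuntimeError(f"{node} unhealthy; halting before the next node")

staged_rollout(["B1", "B2"], soak_seconds=5)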
  
 
> > Yes it will cost a bit more (router is more expensive than a LC)
> 
> Several of my employers have paid only for LCs. I don't think the CAPEX
> difference is meaningful, but operating two separate devices may have
> significant OPEX implications in electricity, rack space, provisioning,
> maintenance etc.
> 
> > And yes there is the "node-slicing" approach from Juniper where one
> > can offload CP onto multiple x86 servers and assign LCs to each server
> > (virtual
> > node) - which would solve my chassis-full problem -but honestly how
> > many of you are running such a setup? Exactly. And that's why I'd be
> > hesitant to deploy this solution in production just yet. I don't know
> > of any other vendor solution like this one, but who knows maybe in 5
> > years this is going to be the new standard. Anyways I need a
> > solution/strategy for the next 3-5 years.
> 
> Node slicing indeed seems like it can be a sufficient compromise here between
> OPEX and availability. I believe (not know) that the shared software risks are
> meaningfully reduced and that bringing down the whole system is sufficiently
> rare to allow an availability upside compared to a single large box.
> 
I tend to agree, though as you say it's a compromise nevertheless.
If one needs to switch to a new version of the fabric in order to support new 
line-cards, or upgrade code on the base system for that matter, the whole thing 
(NFVI) needs to be power-cycled. 

adam 




Re: few big monolithic PEs vs many small PEs

2019-06-20 Thread Mark Tinka



On 19/Jun/19 22:22, adamv0...@netconsultings.com wrote:

> Yes it will cost a bit more (a router is more expensive than an LC)

I found the reverse to be true... chassis are cheap. Line cards are costly.


>  
> Would like to hear what your thoughts are on this conundrum.

So this depends on where you want to deliver your service, and the
function, in my opinion.

If you are talking about an IP/MPLS-enabled Metro-E network, then having
several, smaller routers spread across one or more rings is cheaper and
more effective.

If you are delivering services to large customers from within a data
centre, large edge routers make more sense, particularly given the
rising costs of co-location.

If you are providing BNG services, it depends on how you want to balance
ease of management vs. scale vs. cost. If you have the cash to spend,
de-centralizing your BNG's across a region/city/country will give you
more scale and better redundancy, but could be more costly depending on
your per-box sizing as well as an increase in management time. If you
want to improve management, you can have fewer boxes to cover large
parts of your region/city/country. But this may mean buying a very large
box to concentrate more users in fewer places.

If you are trying to combine Enterprise, Service Provider and Consumer
services in one chassis, well, as the saying goes, "If you are a
competitor, I approve of this message" :-).

Mark.



Re: few big monolithic PEs vs many small PEs

2019-06-20 Thread Tarko Tikan

hey,


For availability I think the best approach is to do many small edge
devices.


This is also great for planned maintenance. ISSU has not really worked 
out for any of the vendors and with two small devices you can upgrade 
them independently.


Great for aggregation, enables you to dual-home access devices into two 
separate PEs that will never be down at the same time be it failure or 
planned maintenance (excluding the physical issues like power/cooling 
but dual-homing to two separate sites is always problematic for eyeball 
networks).


--
tarko


Re: few big monolithic PEs vs many small PEs

2019-06-20 Thread Saku Ytti
On Wed, 19 Jun 2019 at 23:25,  wrote:

> The conclusion I came to was that *currently the best approach would be to
> use several medium to small (fixed) PEs to replace a big monolithic
> chassis-based system.

For availability I think the best approach is to do many small edge
devices. Because software is terrible, and will always be terrible. People
are bad at operating the devices and will always be. Hardware is
something we think about a lot when we think about redundancy, but it's
not that common a reason for an outage.
With more smaller boxes the inevitable human cockup and software
defects will affect fewer customers. Why I believe this to be true is
because the events are sufficiently rare and once those happen, we
find a solution or at the very least a workaround rather fast. With full
inaction you could argue that having A3 and B1+B2 is the same amount of
aggregate outage, as while an outage in B affects fewer customers, there
are two B nodes with equal probability of outage. But I argue that the
events are not independent, they are dependent, so the probability
calculation isn't straightforward. Once we get some rare software
defect or operator mistake on B1, we usually solve it before it
triggers on B2, making the aggregate downtime of the entire system lower.
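
A quick Monte Carlo sketch of that dependence argument (Python; the workaround 
probability and restore time are invented for illustration):

import random

random.seed(1)
TRIALS = 100_000
P_WORKAROUND_IN_TIME = 0.8  # chance the fix lands before the second box trips
OUTAGE_H = 4.0              # hours to restore a tripped box

def big_box():
    # One node carrying 100% of customers: the defect always costs the full hit.
    return 1.0 * OUTAGE_H

def two_small():
    lost = 0.5 * OUTAGE_H                       # B1 trips: half the customers
    if random.random() > P_WORKAROUND_IN_TIME:  # workaround too late: B2 trips too
        lost += 0.5 * OUTAGE_H
    return lost

print("one big box:", sum(big_box() for _ in range(TRIALS)) / TRIALS)
print("two small  :", sum(two_small() for _ in range(TRIALS)) / TRIALS)
# -> 4.0 vs ~2.4 customer-weighted outage hours per defect; the dependence
#    between B1 and B2 is what tips the balance toward the split.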

> Yes it will cost a bit more (a router is more expensive than an LC)

Several of my employers have paid only for LCs. I don't think the CAPEX
difference is meaningful, but operating two separate devices may have
significant OPEX implications in electricity, rack space,
provisioning, maintenance etc.

> And yes there is the "node-slicing" approach from Juniper where one can
> offload CP onto multiple x86 servers and assign LCs to each server (virtual
> node) - which would solve my chassis-full problem -but honestly how many of
> you are running such a setup? Exactly. And that's why I'd be hesitant to
> deploy this solution in production just yet. I don't know of any other
> vendor solution like this one, but who knows maybe in 5 years this is going
> to be the new standard. Anyways I need a solution/strategy for the next 3-5
> years.

Node slicing indeed seems like it can be a sufficient compromise here
between OPEX and availability. I believe (not know) that the shared
software risks are meaningfully reduced and that bringing down the whole
system is sufficiently rare to allow an availability upside compared to a
single large box.


-- 
  ++ytti


Re: few big monolithic PEs vs many small PEs

2019-06-19 Thread i3D.net - Martijn Schmidt via NANOG
Hi Adam,

Depends on how big of a router you need for your "small PE".

Taking Juniper as an example, the MX204 is pretty unbeatable cost-wise if you 
can make do with its 4*QSFP28 & 8*SFP+ interfaces. There's a very big gap 
between the MX204 and the first chassis-based router in the MX lineup, even if 
you only try to replicate the port configuration at first.

Best regards,
Martijn

PS, take note of the MX204 port profiles, not every combination of interface 
speeds is possible: https://apps.juniper.net/home/port-checker/

On 19 June 2019 22:22:45 CEST, adamv0...@netconsultings.com wrote:

Hi folks,

Recently I ran into a peculiar situation where we had to cap a couple of PEs
even though merely half of the rather big chassis was populated with
cards, the reason being that the central RE/RP was not able to cope with the
combined number of routes/vrfs/bgp sessions/etc..

So this made me think about the best strategy in building out SP-Edge
nowadays (yes I'm aware of the centralize/decentralize pendulum swinging
every couple of years).
The conclusion I came to was that *currently the best approach would be to
use several medium to small (fixed) PEs to replace a big monolithic
chassis-based system.
So what I was thinking is,
Yes it will cost a bit more (a router is more expensive than an LC)
Will end up with more prefixes in IGP, more BGP sessions etc.. -don't care.
But the benefits are less eggs in one basket, simplified and hence faster
testing in case of specialized PEs and obviously better RP CPU/MEM to port
ratio.
Am I missing anything please?

*currently,
Yes, some old chassis systems or even multi-chassis systems used to support
additional RPs and offloading some of the processes (e.g. BGP) onto those
-the problem is these are custom hacks and still a single OS which needs
rebooting of LCs/ASICs when being upgraded -so the problem of too many eggs in
one basket still exists (yes, Cisco NCS6k and recent ASR9k Lightspeed LCs are
an exception)
And yes there is the "node-slicing" approach from Juniper where one can
offload CP onto multiple x86 servers and assign LCs to each server (virtual
node) - which would solve my chassis-full problem -but honestly how many of
you are running such a setup? Exactly. And that's why I'd be hesitant to
deploy this solution in production just yet. I don't know of any other
vendor solution like this one, but who knows maybe in 5 years this is going
to be the new standard. Anyways I need a solution/strategy for the next 3-5
years.


Would like to hear what your thoughts are on this conundrum.

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::



--
Sent from my Android device with K-9 Mail. Please excuse my brevity.