Re: Level(3) filtering (was Yahoo outage summary)

2007-07-09 Thread Roland Dobbins



On Jul 9, 2007, at 8:10 PM, Chris L. Morrow wrote:


In the number of customer conversations I've had about this it's always
sort of surprising that people think it's 'ok' to not have a prefix-list
:( cause, guess what: "if you don't have one and they don't have one...
THEY will get you eventually"


Many folks seem to think that they'll be OK because 'someone else'  
will be doing this for them, and so they're protected.  They also  
don't think about the fact that they themselves could accidentally  
cause a problem for others (and, in some cases, for themselves, by  
acting as an inadvertent sinkhole).  But when it's explained to them  
that a) if everyone thinks that 'someone else' will do the  
appropriate filtering, then nobody will do it, and b) that they can  
end up hosing themselves and also taking a big reputational hit, most  
people I talk to about this seem to understand.


The problem is that this is largely an ad-hoc, 1:1 type of  
educational effort, which doesn't scale well.  And in many cases,  
folks seem to find it difficult to go to their management and explain  
that they must invest the opex to implement and maintain these  
policies (along with BCP38, iACLs, et al.); sort of an inversion of
"The Emperor's New Clothes", heh.


---
Roland Dobbins <[EMAIL PROTECTED]> // 408.527.6376 voice

   Culture eats strategy for breakfast.

   -- Ford Motor Company





Re: Level(3) filtering (was Yahoo outage summary)

2007-07-09 Thread Chris L. Morrow



On Mon, 9 Jul 2007, Kevin Epperson wrote:

>
> There is some misinformation in previous posts that I would like to
> clarify on the Level 3 side of things.
>

and I'd apologize for hinting that that might be the problem :(

> Level 3's own registry and known public route registries.  As several
> folks have pointed out there are minimal checks for the validity of the
> source information.

this was what bit panix/edison I believe... :(

>
> As an aside I see an increase in the number of downstreams asking for
> as-path filtering or *no* filtering usually with justifications of ISP X
> doesn't require us to register routes or just does as-path filtering.  In
> my opinion that is bad news for everyone as documented in numerous
> BCPs, presentations and route-leaks.

agreed, there is this trend, it's disturbing :( (to me at least) In the
number of customer conversations I've had about this it's always sort of
surprising that people think it's 'ok' to not have a prefix-list :( cause,
guess what: "if you don't have one and they don't have one... THEY will
get you eventually"


Level(3) filtering (was Yahoo outage summary)

2007-07-09 Thread Kevin Epperson


There is some misinformation in previous posts that I would like to 
clarify on the Level 3 side of things.


Every transit-like connection on AS3356 is prefix-filtered including all 
parties in this event.  On AS3356 all prefix filters and import policies 
on BGP sessions are audited and checked in almost realtime for people or 
system errors (missing, mis-referenced, not referenced, otherwise broken 
config, etc.)  The prefix filters themselves are generated using data from 
Level 3's own registry and known public route registries.  As several 
folks have pointed out there are minimal checks for the validity of the 
source information.


Further details on Level 3 filtering policies are available at:
   whois -h rr.level3.net AS3356 | grep remarks
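
As a rough illustration of what generating prefix filters from registry
data can look like, here is a minimal sketch.  It is not Level 3's tooling:
it assumes the RPSL route objects have already been fetched as text, and
parse_route_objects, build_prefix_list, the CUSTOMER-IN name, the sample
ASNs and the Cisco-style output are all made up for the example.  Note that
it trusts the registry content as-is, which is the weakness several folks
have pointed out.

```python
import ipaddress

# Hypothetical sample data: two IPv4 route objects in RPSL form.
SAMPLE = """\
route:   192.0.2.0/24
origin:  AS64500
source:  EXAMPLE

route:   198.51.100.0/24
origin:  AS64501
source:  EXAMPLE
"""

def parse_route_objects(rpsl_text):
    """Yield (network, origin) pairs from blank-line-separated route objects."""
    for block in rpsl_text.strip().split("\n\n"):
        attrs = {}
        for line in block.splitlines():
            # Skip server comments; keep the first value seen for each attribute.
            if ":" in line and not line.startswith(("%", "#")):
                key, _, value = line.partition(":")
                attrs.setdefault(key.strip().lower(), value.strip())
        if "route" in attrs and "origin" in attrs:
            yield ipaddress.ip_network(attrs["route"]), attrs["origin"].upper()

def build_prefix_list(rpsl_text, allowed_origins, name="CUSTOMER-IN"):
    """Return prefix-list lines for routes whose origin AS is allowed (IPv4 only)."""
    prefixes = sorted(
        {net for net, origin in parse_route_objects(rpsl_text)
         if origin in allowed_origins}
    )
    return [
        f"ip prefix-list {name} seq {5 * (i + 1)} permit {net}"
        for i, net in enumerate(prefixes)
    ]

if __name__ == "__main__":
    for line in build_prefix_list(SAMPLE, {"AS64500"}):
        print(line)
```

Nothing here checks whether the registrant is entitled to the prefix; that
would have to come from some other source of authority.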

As an aside I see an increase in the number of downstreams asking for 
as-path filtering or *no* filtering usually with justifications of ISP X 
doesn't require us to register routes or just does as-path filtering.  In 
my opinion that is bad news for everyone as documented in numerous 
BCPs, presentations and route-leaks.


-Kevin

Disclaimer - I do work for Level 3 but am expressing my opinions and not 
those of my employer.





RE: IP Allocations and moving AS numbers

2007-07-09 Thread Azinger, Marla

Shane-  Please redirect your email questions to ARIN ppml or discuss.  That
will be a better forum for you with these types of questions.  I will also email
you on the side.

Cheers!
Marla Azinger
Frontier Communications
AC Chair

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
Shane Owens
Sent: Monday, July 09, 2007 2:35 PM
To: nanog@nanog.org
Subject: IP Allocations and moving AS numbers



All, I have been all but gone from IP management and BGP administration
tasks for over 2 years while teaching myself telecom as a CLEC.  I
recently had a past business acquaintance contact me that is currently
reselling bandwidth and using a 3rd party's network to do so.  He
currently has about 15 /24 address blocks through this 3rd party and
wants to move to his own AS number and away from theirs.

I know when I was last involved it seemed a pretty difficult process to
do this through ARIN.  Has the process changed at all recently? I am
going to help them get the AS number and get the process started, but
when asked if they could keep their existing IP addresses I explained that
the existing 3rd party would need to write a letter stating that they
are willing to transfer those IPs to the new AS, that ARIN would have to
approve it, and that it may be a bit of a hassle.

They are currently running 7 data centers nationally and are willing to
migrate IPs, but would rather not if they can help it.

Does this sound about right?  I am going to go read the ARIN pages
tonight to see if I can answer this myself, but don't have time during
the workday to do a lot of research on this myself.  Figure someone here
probably knows already.

Shane Owens
DNA Communications
[EMAIL PROTECTED]
(w)815-562-4290 x-201
(c)815-793-3822 



IP Allocations and moving AS numbers

2007-07-09 Thread Shane Owens

All, I have been all but gone from IP management and BGP administration
tasks for over 2 years while teaching myself telecom as a CLEC.  I
recently had a past business acquaintance contact me that is currently
reselling bandwidth and using a 3rd party's network to do so.  He
currently has about 15 /24 address blocks through this 3rd party and
wants to move to his own AS number and away from theirs.

I know when I was last involved it seemed a pretty difficult process to
do this through ARIN.  Has the process changed at all recently? I am
going to help them get the AS number and get the process started, but
when asked if they could keep their existing IP addresses I explained that
the existing 3rd party would need to write a letter stating that they
are willing to transfer those IPs to the new AS, that ARIN would have to
approve it, and that it may be a bit of a hassle.

They are currently running 7 data centers nationally and are willing to
migrate IPs, but would rather not if they can help it.

Does this sound about right?  I am going to go read the ARIN pages
tonight to see if I can answer this myself, but don't have time during
the workday to do a lot of research on this myself.  Figure someone here
probably knows already.

Shane Owens
DNA Communications
[EMAIL PROTECTED]
(w)815-562-4290 x-201
(c)815-793-3822 



Re: Yahoo outage summary

2007-07-09 Thread Tony Tauber


On Mon, Jul 09, 2007 at 04:50:56PM -0400, Joe Abley wrote:

> SIDR is only of any widespread use if it is coupled with policy/ 
> procedures at the RIRs to provide certificates for resources that are  
> assigned/allocated. However, this seems like less of a hurdle than  
> you'd think when you look at how many RIR staff are involved in  
> working on it.

> So, if you consider some future world where suitably
> machine-readable repositories of number resources (e.g. IRRs) are  
> combined with machine-verifiable certificates affirming a customer's  
> right to use them, how far out of the woods are we? Or are we going  
> to find out that the real problem is some fundamental unwillingness  
> to automate this stuff, or something else?

Going to a model with reasonable and well-defined policies and
procedures is a good thing.  However, it renders all the existing IRR
information suspect.  Even the RRs run by RIRs are worthless as they
stand.  For instance ARIN runs an RR but does no validation of what goes
in there today.

A reasonable approach might be to pick up with tools based on the new
SIDR work and leave the existing IRR info behind.

Tony


Re: Yahoo outage summary

2007-07-09 Thread Jared Mauch

On Mon, Jul 09, 2007 at 04:50:56PM -0400, Joe Abley wrote:
> 
> 
>  On 9-Jul-2007, at 16:13, Jared Mauch wrote:
> 
> > Some have automated systems, but they're dependent on IRR data
> > being correct.  There are even tools to automate population of IRR data.
> 
>  Building customer filters from the IRR seems like it should fall in the 
>  "easy" bucket, given how long people have been doing it, and for how long. 
>  It's the lack of a way to trust the data that's published in the IRR that 
>  always seems to be the stumbling block.

-- snip --

>  So, if you consider some future world where suitably
>  machine-readable repositories of number resources (e.g. IRRs) are combined 
>  with machine-verifiable certificates affirming a customer's right to use 
>  them, how far out of the woods are we? Or are we going to find out that the 
>  real problem is some fundamental unwillingness to automate this stuff, or 
>  something else?

It's that some folks feel entitled to announce routes without
registering them.  Take ANS vs Sprintlink as the classic example.  Not
much has changed since then.  Nor have the tools evolved significantly.

Some vendors still don't get router configuration from tools yet.
Try to automate something and it's either not easy or impossible.  Even the
best solutions on the market have some problems when you feed them an 8+Meg
config.  It takes a lot of CPU time to process that much.

There really need to be some (ick, ignore that I suggested this)
Web 2.0 IRR tools.  Something that can smartly populate an IRR or
IRR-like dataset.  Something that can be taught to 'learn' what is
reasonable.  I've seen some cool things that show promise (eg: pretty
good bgp), but there's always some interesting drawback.

Plus, as Patrick said earlier (and I generally agree), these
types of "attacks" are rare and usually short lived.  Even those
like the panix situation didn't last very long.  Perhaps it's not as
important to think about now.


- Jared

-- 
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.


Re: Yahoo outage summary

2007-07-09 Thread Joe Abley



On 9-Jul-2007, at 16:13, Jared Mauch wrote:


Some have automated systems, but they're dependent on IRR data
being correct.  There are even tools to automate population of IRR  
data.


Building customer filters from the IRR seems like it should fall in  
the "easy" bucket, given how long people have been doing it, and for  
how long. It's the lack of a way to trust the data that's published  
in the IRR that always seems to be the stumbling block.


Various ops-aware people have been attacking the correctness issue in  
the SIDR working group. The work seems fairly well-cooked to me, and  
I seem to think that Geoff Huston has wrapped some proof-of-concept  
tools around the crypto.


SIDR is only of any widespread use if it is coupled with policy/ 
procedures at the RIRs to provide certificates for resources that are  
assigned/allocated. However, this seems like less of a hurdle than  
you'd think when you look at how many RIR staff are involved in  
working on it.


So, if you consider some future world where suitably
machine-readable repositories of number resources (e.g. IRRs) are  
combined with machine-verifiable certificates affirming a customer's  
right to use them, how far out of the woods are we? Or are we going  
to find out that the real problem is some fundamental unwillingness  
to automate this stuff, or something else?



Joe


Re: Yahoo outage summary

2007-07-09 Thread Jared Mauch

On Mon, Jul 09, 2007 at 01:23:46PM -0500, Borchers, Mark M. wrote:
> Jared Mauch wrote:
> 
> > The simple truth is that prefix lists ARE hard to manage. 
> 
> Medium-hard IMHO.  Adding prefixes is relatively easy to implement.
> Tracking and removing outdated information significantly more challenging.
> 
> > Some people lack tools and automation to make it work or to manage their
> > networks.
>  
> Best I can tell, even the largest transit providers handle prefix list
> updates manually.

Some have automated systems, but they're dependent on IRR data
being correct.  There are even tools to automate population of IRR data.

> At this stage of history, a human interface is probably necessary in making
> a reasonable assessment about the legitimacy of an update request.

I think here is one of the cruxes of the problem.  If it
requires a human, there are a few things that will happen:

1) prefix-list volume will be too much to be dealt with.
   I see some per-asn prefix lists that would be 255k routes and
   include all sorts of unreasonable junk like /32s

2) even taking a reasonable network (in this case, I picked AS286),
   I see 4425 routes.  Either you check these all manually (at least
   once), or come up with some way to model it.  I currently see 250
   routes in the table with as-path _286_ from my view.  Either
   there's a lot of cruft there, or there's a lot of multihomed folks
   where I see a better path.  Which is it?  Do I have the time to
   crunch this myself?

3) What about those unique customer relationships?  (this is made up)
   Like where ATT buys transit from Cogent for those few prefixes
   in New Zealand they care about?  There's always some compelling
   business case to do something wonky.  Does this mean that ATT needs
   to register their prefixes in the cogent IRR?  How do you keep it
   'quiet' that this is happening, instead of an object saying
   'att priority customer route'?  How do you validate these?  Even
   the 'big guys' will make policy mistakes once in a while.
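
The registered-versus-seen comparison in (2) is the sort of thing that could
be crunched mechanically.  A minimal sketch, assuming both sets are already
available as lists of prefix strings (the function name and the sample
prefixes are hypothetical, and it only does exact matches, so more-specifics
and aggregates are not reconciled):

```python
def registry_cruft(registered, observed):
    """Compare registered prefixes with prefixes actually seen in the table."""
    registered, observed = set(registered), set(observed)
    return {
        # Registered but never seen: possible cruft, or routes seen via a better path.
        "registered_not_seen": sorted(registered - observed),
        # Seen but not registered: would be dropped by a strict IRR-based filter.
        "seen_not_registered": sorted(observed - registered),
        # Share of the registered set that shows up in the table at all.
        "seen_fraction": (
            len(registered & observed) / len(registered) if registered else 0.0
        ),
    }

if __name__ == "__main__":
    report = registry_cruft(
        ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"],
        ["192.0.2.0/24", "10.0.0.0/8"],
    )
    print(report)
```

With the AS286 numbers above, 250 routes seen against 4425 registered works
out to under 6% at best, which says nothing about whether the rest is cruft
or simply hidden behind a better path.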

There needs to be some 'better-way' IMHO, but my ideas on this
topic have not gotten far enough along for me to put code behind them.
Perhaps I'll need to reprioritize those efforts.  It seems to me like
someone could do a cool system that churns through the route-views data, or
if necessary just duplicate part of it by getting lots of bgp feeds and
trying to parse the data.

Too bad there's not a good way to do something like dampening on routes
where depending on the age of the announcement and some 'trust' factor you can
assign a series of local-preferences.  I'd really like to see something like
this exist.  ie: "dampen" the "new" path (even if the prefix is a longer
one) until some timer has ticked (unless some policy criteria are satisfied,
such as same as-path, etc..).
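
To make the idea a bit more concrete, here is a sketch of how such a
"dampening" could map announcement age and a trust factor to a
local-preference.  No router implements this as far as I know; the function
name, the 24-hour hold time, the base value of 100 and the maximum penalty
of 50 are all assumptions picked for illustration.

```python
def local_pref(age_seconds, trust, same_as_path=False,
               base=100, penalty=50, hold_seconds=86400):
    """Return a local-preference for a path.

    New, untrusted paths start at base - penalty and recover linearly to
    base as the announcement ages toward hold_seconds.  A path with an
    unchanged AS path is exempt from the penalty (the policy escape hatch).
    """
    age_seconds = max(age_seconds, 0)
    if same_as_path or age_seconds >= hold_seconds:
        return base
    trust = min(max(trust, 0.0), 1.0)             # clamp trust to [0, 1]
    remaining = 1.0 - age_seconds / hold_seconds  # 1.0 when brand new
    return base - round(penalty * remaining * (1.0 - trust))

if __name__ == "__main__":
    print(local_pref(0, 0.0))        # brand new, untrusted path
    print(local_pref(43200, 0.0))    # half-way through the hold time
    print(local_pref(0, 0.0, True))  # same as-path, no penalty
```

Since local-preference is compared only among paths for the same prefix, a
scheme like this would not by itself stop a "new" more-specific from winning
on longest match; that case would need the route to be held back entirely
until the timer expires.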

There's also the issue of how to implement this in the existing
router(s), some of them with slower CPUs.  There are a lot of folks using
older hardware to do BGP that just might melt if they had to evaluate some
huge routing policy.

- Jared

-- 
Jared Mauch  | pgp key available via finger from [EMAIL PROTECTED]
clue++;  | http://puck.nether.net/~jared/  My statements are only mine.


Re: Yahoo outage summary

2007-07-09 Thread Patrick W. Gilmore


On Jul 9, 2007, at 11:19 AM, jared mauch wrote:

The simple truth is that prefix lists ARE hard to manage. There are  
a lot of folks that have complex relationships or don't see why  
they should register their routes. Some people lack tools and  
automation to make it work or to manage their networks. It would be  
nice to see everyone filter routes, including those from even  
transit and large peers. I don't think we will be able to ignore  
this forever. I also do not see the status quo changing soon either.


I'm not sure we can't ignore it forever.

The telephone network has been around for a lot longer than the 'Net,  
has way, way, way more connections, and there are corners of it which  
are managed even worse than the inter-web.


Like Sean said, cost/benefit.  If the cost of avoiding a 1 day outage  
per year is the same as a 5 day outage, management will not fix it.


--
TTFN,
patrick



Re: Yahoo outage summary

2007-07-09 Thread Douglas Otis



On Jul 9, 2007, at 9:31 AM, Randy Bush wrote:


Tony Tauber wrote:
There's no magic bullet in updating BGP if a fundamental,  
verifiable data model is not accepted and agreed upon.


the space of routing data validation is large, we can explore it at  
our leisure, and we have been for some years.  but my point was  
that it is silly to indulge in conjecturbation on the cause of the  
recent event and excoriate l(3), hanaro, or john curran's  
grandmother until we have heard from the folk who have actual data.


I can't help but conjecturbate how this might relate to route flap
damping, and whether overly aggressive RFD might be related to such
DoS.  The other side of the coin would be that RFD might also limit
the extent of spoofed routes.  The amount of noise within the system
makes it difficult for administrators to fully comprehend what is
going on while it is happening.  A means to even partially validate
routing information might provide more timely and greater insight.
This insight may help rule out nefarious causes.  When it doesn't,  
the issue might be far more serious.  Crying wolf too many times is  
bad, but not seeing the wolf could be worse.


-Doug


Re: Yahoo outage summary

2007-07-09 Thread Sean Donelan


On Tue, 10 Jul 2007, Randy Bush wrote:

the space of routing data validation is large, we can explore it at our
leisure, and we have been for some years.  but my point was that it is
silly to indulge in conjecturbation on the cause of the recent event and
excoriate l(3), hanaro, or john curran's grandmother until we have heard
from the folk who have actual data.


If companies thought it was in their self-interest, they might actually
share that actual data.  However history has shown over and over again
that companies generally avoid any public discussion about their problems
until they are overwhelmed.

http://tech.monstersandcritics.com/news/article_1327791.php/Yahoo_outage_caused_by_Level3_BGP_issue

If you wait for the companies to reveal the data, you will probably have
a long wait.

WorldCom still hasn't released its official investigative report into why 
its national frame networks failed for nearly a week in 1999.


Re: Yahoo outage summary

2007-07-09 Thread Randy Bush

Tony Tauber wrote:
> On Mon, Jul 09, 2007 at 02:31:10PM +0800, Randy Bush wrote:
>>> following existing BCPs with currently-deployed
>>> techniques/functionality/features would have prevented the issue
>>> described in the post.
>> knowing that level(3) is one of the most serious deployments of
>> irr-based route filters and other prudent practices, perhaps we should
>> wait for a post mortem from level(3) before jumping to conclusions?
> There's no magic bullet in updating BGP if a fundamental, verifiable
> data model is not accepted and agreed upon.

the space of routing data validation is large, we can explore it at our
leisure, and we have been for some years.  but my point was that it is
silly to indulge in conjecturbation on the cause of the recent event and
excoriate l(3), hanaro, or john curran's grandmother until we have heard
from the folk who have actual data.

randy


Re: Yahoo outage summary

2007-07-09 Thread Tony Tauber

On Mon, Jul 09, 2007 at 02:31:10PM +0800, Randy Bush wrote:
> 
> > following existing BCPs with currently-deployed
> > techniques/functionality/features would have prevented the issue
> > described in the post.
> 
> knowing that level(3) is one of the most serious deployments of
> irr-based route filters and other prudent practices, perhaps we should
> wait for a post mortem from level(3) before jumping to conclusions?
> 
> randy

Level3's filter implementation is indeed well-done, however, the fact
remains that the IRR (which I use and endorse) has no linkage to any
other source of information for purposes of validation.
It's fundamentally garbage in, garbage out.

Say some ISP has a provisioning tool which updates their router
configs and the IRR in one fell swoop.  If the provisioner makes a typo
the IRR will gladly accept the entry for, say, 12/8, and the upstream
will rebuild their filters with that entry automatically and you get the
same result.
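
A minimal sketch of the kind of linkage that is missing, assuming some
authoritative list of what the registrant has actually been allocated were
available (today it is not, which is the point; the function name and the
sample prefixes are hypothetical):

```python
import ipaddress

def unauthorized_prefixes(registered, allocations):
    """Return registered prefixes not covered by any of the registrant's allocations."""
    allocs = [ipaddress.ip_network(a) for a in allocations]
    bad = []
    for text in registered:
        net = ipaddress.ip_network(text)
        # A registration is acceptable only if it sits inside an allocation
        # of the same address family.
        covered = any(
            net.version == a.version and net.subnet_of(a) for a in allocs
        )
        if not covered:
            bad.append(text)
    return bad

if __name__ == "__main__":
    # The customer holds 198.51.100.0/24; the provisioner fat-fingers 12.0.0.0/8.
    print(unauthorized_prefixes(
        ["198.51.100.0/25", "12.0.0.0/8"],
        ["198.51.100.0/24"],
    ))
```

A check like this is only as good as the allocation data behind it, so it
moves the garbage-in, garbage-out problem to that source rather than
removing it.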

There's no magic bullet in updating BGP if a fundamental, verifiable
data model is not accepted and agreed upon.

Tony


Re: Yahoo outage summary

2007-07-09 Thread Florian Weimer

* Valdis Kletnieks:

> (Yes, I know the jury is still out on what really happened at L3-Hanaro.
> Doesn't change the fact that we collectively shoot ourselves in the foot
> because providers will believe the most implausible things from their
> neighbors, like announcements for 128/1 ;)

Well, if L3 creates its filters based on RADB entries (which is still
considered a RR, isn't it?), they will accept a 213/8
announcement. 8-(  128/1 isn't too far away, I fear.
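
For what it's worth, a plain length sanity check is easy to sketch.  The
bounds below (/8 to /24 for IPv4) and the function name are assumptions for
illustration, not anyone's actual policy:

```python
import ipaddress

def plausible(prefix, min_len=8, max_len=24):
    """Return True if an IPv4 prefix length falls within the accepted bounds."""
    net = ipaddress.ip_network(prefix, strict=False)
    return net.version == 4 and min_len <= net.prefixlen <= max_len

if __name__ == "__main__":
    for p in ("128.0.0.0/1", "213.0.0.0/8", "192.0.2.1/32", "192.0.2.0/24"):
        print(p, plausible(p))
```

This rejects 128/1 and stray /32s, but a bogus 213/8 still passes, so length
bounds alone do not address the registry-trust problem.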

-- 
Florian Weimer<[EMAIL PROTECTED]>
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99


Re: Yahoo outage summary

2007-07-09 Thread jared mauch




On Jul 9, 2007, at 10:47 AM, [EMAIL PROTECTED] wrote:


On Mon, 09 Jul 2007 02:18:25 -, "Chris L. Morrow" said:

While S*BGP seem like they may offer additional protections and additional
knobs to be used for protecting 'us' from 'them', the very basics are
obviously not being done so added complexity is not going to really help
:( Or, perhaps its not that its not going to help its just not going to
get done because even prefix-lists are 'too hard', apparently.


"Wow, prefix-lists are *hard*" -- BGP Barbie..

You'd think that by now, we as an industry could do better than that.


I agree that we need something better but nobody has shown me a better  
system than prefix lists and irr that actually *works*.


The simple truth is that prefix lists ARE hard to manage. There are a  
lot of folks that have complex relationships or don't see why they  
should register their routes. Some people lack tools and automation to  
make it work or to manage their networks. It would be nice to see  
everyone filter routes, including those from even transit and large  
peers. I don't think we will be able to ignore this forever. I also do  
not see the status quo changing soon either.


Re: Yahoo outage summary

2007-07-09 Thread Chris L. Morrow



On Mon, 9 Jul 2007 [EMAIL PROTECTED] wrote:

> On Mon, 09 Jul 2007 02:18:25 -, "Chris L. Morrow" said:
>
> > While S*BGP seem like they may offer additional protections and additional
> > knobs to be used for protecting 'us' from 'them', the very basics are
> > obviously not being done so added complexity is not going to really help
> > :( Or, perhaps its not that its not going to help its just not going to
> > get done because even prefix-lists are 'too hard', apparently.
>
> "Wow, prefix-lists are *hard*" -- BGP Barbie..

shopping anyone?

>
> You'd think that by now, we as an industry could do better than that.
>

I think that overall, over a goodly period of time, we are... we
occasionally step on the wrong end of the rake still :(

> (Yes, I know the jury is still out on what really happened at L3-Hanaro.

from some other conversations about this, this seems to be a similar
problem to what happened to NY-Edison about 1.5/2 years ago now
(panix.com route hijackage)... 'auto filter from IRR data' without some
form of checking for proper authority.

Of course, now that I stirred the 'l3 shoulda filtered' pot I should
probably also stir the 'large ISP customers should outbound prefix-filter'
 pot. It's very likely that they DO filter outbound, at least to pref
routes from place to place, perhaps twin failures caught them?

:( I think Marcus, Randy, Steve, Lixia all are getting at an underlying
issue: "The interwebs are not as trivial to the world as they once were"
So more strict control and operational due diligence should be on
everyone's plate... At least for basics like making sure the routing system
functions properly going forward.

Anyway, should be interesting to get some more details on what happened if
they are ever to become available.

-Chris


Re: Yahoo outage summary

2007-07-09 Thread Valdis . Kletnieks
On Mon, 09 Jul 2007 02:18:25 -, "Chris L. Morrow" said:

> While S*BGP seem like they may offer additional protections and additional
> knobs to be used for protecting 'us' from 'them', the very basics are
> obviously not being done so added complexity is not going to really help
> :( Or, perhaps its not that its not going to help its just not going to
> get done because even prefix-lists are 'too hard', apparently.

"Wow, prefix-lists are *hard*" -- BGP Barbie..

You'd think that by now, we as an industry could do better than that.

(Yes, I know the jury is still out on what really happened at L3-Hanaro.
Doesn't change the fact that we collectively shoot ourselves in the foot
because providers will believe the most implausible things from their
neighbors, like announcements for 128/1 ;)




Re: Vericenter Denver Outtage

2007-07-09 Thread James Baldwin


Outage with the primary Sprintlink connection. Still no ETR.

James Baldwin

On Jul 9, 2007, at 2:09 AM, James Baldwin wrote:


Does anyone have further information on the Vericenter Denver outage?

Support is aware of the issue; however, they could not give me a
problem description or an ETR at the time of ticket creation.


James Baldwin



Vericenter Denver Outtage

2007-07-09 Thread James Baldwin


Does anyone have further information on the Vericenter Denver outage?

Support is aware of the issue; however, they could not give me a
problem description or an ETR at the time of ticket creation.


James Baldwin