from:"Jeff Wheeler"

NANOG58 parking

2013-05-05 Thread Jeff Wheeler

I noticed that some folks were unhappy with the parking fee in Orlando.

The Roosevelt New Orleans, for NANOG 58, tells me that the only
on-site parking is valet for $42/day.  Anyone planning to drive or
stay at a different hotel may want to consider that in advance.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Mitigating DNS amplification attacks

2013-05-01 Thread Jeff Wheeler

On Tue, Apr 30, 2013 at 8:35 PM, Jared Mauch ja...@puck.nether.net wrote:
 Please provide advice and insights as well as directing customers to the 
 openresolverproject.org website. We want to close these down, if you need an 
 accurate list of IPs in your ASN, please email me and I can give you very 
 accurate data.

I think that a public list of open-resolvers is probably overdue, and
the only way to get them fixed.

It is trivial to scan the entire IPv4 address space for DNS servers
that do no throttling even without the resources of a malicious
botnet.
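To put a rough number on "trivial", here is a back-of-the-envelope sketch of how long one probe per IPv4 address takes. The probe rates are assumptions picked for illustration, not measurements, and the address count is an upper bound that includes unroutable space.

```python
# Back-of-the-envelope: time to send one DNS probe to every IPv4 address.
TOTAL_IPV4_ADDRESSES = 2 ** 32  # upper bound; includes space nobody routes


def scan_hours(probes_per_second: int) -> float:
    """Hours needed to send one probe to every IPv4 address at a fixed rate."""
    return TOTAL_IPV4_ADDRESSES / probes_per_second / 3600


if __name__ == "__main__":
    # Assumed probe rates, chosen only to show the order of magnitude.
    for pps in (10_000, 100_000, 1_000_000):
        print(f"{pps:>9} pps -> {scan_hours(pps):7.1f} hours")
```

Even at the lowest assumed rate the whole space is covered in days, which is the point: no botnet is required.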

Smurf was only fixed because, as there were fewer networks not
running `no ip directed-broadcast`, the remaining amplification
sources were flooded with huge amounts of malicious traffic.  The
public list of smurf amplifiers turned out to be the only way to
really deal with it.  I predict the same will be true with DNS.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Cloudflare is down

2013-03-04 Thread Jeff Wheeler

On Mon, Mar 4, 2013 at 9:51 AM, Leo Bicknell bickn...@ufp.org wrote:
 will fix the problem.  It won't.  Next time the issue will be
 different, and the same undertrained person who missed the packet
 size this time will miss the next issue as well.  They should all be
 sitting around saying, "how can we hire competent network admins for
 our NOC," but that would cost real money.

I think that is hard because virtually all training / education in our
industry is based on procedures, not on concepts.

Pick up any book about networking and you'll find examples of how to
configure a lab of Cisco 2900s so you can pass an exam.  Very few that
go into conceptual detail or troubleshooting of any kind.  Educational
programs suffer from the same flaw.

There are exceptions to this rule, but they are very few.  I'm sure
many NANOG readers are familiar with Interdomain Multicast Routing,
for example.  It is an excellent book because it covers concepts and
compares two popular vendor platforms on a variety of multicast
topics.

We have lots of stupid people in our industry because so few
understand The Way Things Work.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: 32-bit ASes at routeviews

2012-12-17 Thread Jeff Wheeler

On Mon, Dec 17, 2012 at 6:14 AM, Claudio Jeker cje...@diehard.n-r-g.com
wrote:
 This can happen when old 2-byte-only routers are doing prepends with the
 neighbor address (4-byte). Then the magic in the 4-byte AS RFC to fix up
 ASPATH has no chance to work and you will see 23456.

After a careful re-read of RFC4893 section 4.2.3 Processing Received
Updates, I am fairly sure it is either an implementation issue with the
involved 4-octet ASN routers, or else their transit providers are using
as-path-*expand* when learning their routes for some reason (customers ask
for the strangest things.)

The specification for 4-octet AS refers to old and new BGP speakers,
which I'll do here:

When NEW speaker receives a route from an OLD speaker, its job is to make
AS_PATH and AS4_PATH the same length by using ASNs from AS_PATH, which
cannot have been inserted into AS4_PATH by the OLD speaker(s) that do not
support the Attribute.

If a NEW speaker implements as-path-prepend incorrectly, and puts 23456
(AS_TRANS) into AS4_PATH instead of his real ASN, then the route passes
through some OLD speakers and out to a NEW one again, the second NEW
speaker has no opportunity to reconstruct the correct path.

On the other hand, if an OLD speaker is configured for as-path-*expand* as
it learns routes from a NEW speaker, then it may insert AS_TRANS into the
AS_PATH but no entries are being pushed to AS4_PATH.  This is a limitation
of the specification and cannot be avoided.  In effect, the use of
as-path-*expand* at a NEW-OLD boundary where the NEW router has a 4-octet
ASN and OLD router is performing *expand* means the correct AS_PATH cannot
be rebuilt.
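The reconstruction step described above can be sketched as follows. This is a simplified illustration, not any vendor's implementation: paths are flat lists with the most recently prepended ASN first, AS_SET and confederation segments are ignored, and the ASNs are documentation values chosen for the example.

```python
AS_TRANS = 23456  # 2-octet placeholder that stands in for any 4-octet ASN


def reconstruct_as_path(as_path, as4_path):
    """Merge AS_PATH and AS4_PATH the way a NEW speaker is expected to.

    Simplification: both arguments are flat lists of ASNs, leftmost entry
    is the most recent hop. AS_SET and confederation segments are ignored.
    """
    if len(as_path) < len(as4_path):
        # Inconsistent attributes: AS4_PATH is ignored, AS_PATH is used as-is.
        return list(as_path)
    # Hops added by OLD speakers exist only at the front of AS_PATH, so take
    # just enough leading ASNs from AS_PATH to make the lengths match.
    missing = len(as_path) - len(as4_path)
    return list(as_path[:missing]) + list(as4_path)


# Origin AS 65536 (4-octet) -> OLD AS 64500 -> OLD AS 64501 -> NEW speaker.
clean = reconstruct_as_path([64501, 64500, AS_TRANS], [65536])

# Same origin, but OLD AS 64500 "expands" with its neighbor's ASN twice.
# It only knows that neighbor as AS_TRANS and cannot touch AS4_PATH.
expanded = reconstruct_as_path([64500, AS_TRANS, AS_TRANS, AS_TRANS], [65536])

if __name__ == "__main__":
    print(clean)     # real 4-octet ASN recovered at the origin
    print(expanded)  # AS_TRANS entries remain in the middle of the path
```

In the second case the merge rule is followed correctly and 23456 still shows up, which matches the observation at routeviews.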

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

OpenBGPd problems relating to misuse of RESERVED bits in BGP Attribute Flags field

2012-11-29 Thread Jeff Wheeler

I had two downstream BGP customers experience problems with an OpenBGPd bug
tonight.  Before diving into detail, I would like to link this mailing list
thread, because this is not a new issue and a patch is available:
http://www.mail-archive.com/misc@openbsd.org/msg115071.html

For the following DFZ routes, I see wrong use of the fifth bit in the
Attribute Flags field:
  Aggregator (7), length: 8, Flags [OT+8]:  AS #68, origin 192.65.95.253
0x:   0044 c041 5ffd
  Updated routes:
128.165.0.0/16
141.111.0.0/16
192.65.95.0/24
192.12.184.0/24
204.121.0.0/16

According to RFC 4271 page 17, the low-order four bits of the Attribute
Flags octet are unused.  They MUST be zero when sent and MUST be ignored
when received.  I read "ignored" to mean: don't tear down the BGP session
and print a cryptic error that the user probably will be unable to debug.
The OpenBGPd guys clearly agree and have supplied a patch, so affected
users should visit the above mailing list link and install it.
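For readers who want to check a capture by hand, here is a small sketch of decoding the Attribute Flags octet. It assumes the "+8" in the output above denotes the value of the low-order four bits, which would make the octet 0xC8; the function names are made up for this example.

```python
# High-order four bits of the BGP Attribute Flags octet.
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20
EXTENDED_LENGTH = 0x10
UNUSED_MASK = 0x0F  # low-order four bits: zero when sent, ignored when received


def decode_attr_flags(flags: int) -> dict:
    """Split a flags octet into its named bits plus the unused remainder."""
    return {
        "optional": bool(flags & OPTIONAL),
        "transitive": bool(flags & TRANSITIVE),
        "partial": bool(flags & PARTIAL),
        "extended_length": bool(flags & EXTENDED_LENGTH),
        "unused_bits": flags & UNUSED_MASK,
    }


def tolerant_flags(flags: int) -> int:
    """What a receiver that ignores the unused bits would act on."""
    return flags & ~UNUSED_MASK & 0xFF


if __name__ == "__main__":
    observed = 0xC8  # assumed encoding of "Flags [OT+8]"
    print(decode_attr_flags(observed))
    print(hex(tolerant_flags(observed)))
```

A receiver that masks the unused bits sees an ordinary optional transitive attribute and has no reason to reset the session.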

Here are my notes for this RFC page and a small diagram of the packet
header, because surprisingly, there isn't one in the RFC already:
http://inconcepts.biz/~jsw/img/1121129aa-rfc4271pg17scan.jpg  Sorry about
the poor quality of this, but it is past 3am here, and I know of several
operators (besides my downstream customers) who are experiencing this
problem right now.

If I were broken by this right now, I would either patch my
OpenBGPd or ask my eBGP neighbors not to send me the above five routes
(filtering them on my own OpenBGPd router probably won't help).

Thanks, I hope this is helpful
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Looking for recommendation on 10G Ethernet switch

2012-11-02 Thread Jeff Wheeler

On Fri, Nov 2, 2012 at 11:13 AM, Eric Germann egerm...@limanews.com wrote:
 I'm looking for a recommendation on a smallish 10G Ethernet switch for a
 small virtualization/SAN implementation (4-5 hosts, 2 SAN boxes) over
 iSCSI with some legacy boxes on GigE.

 1Gbps.   Assessing whether it is better to go 10G now vs. multi-pathing
 with quad GigE cards.  Trying to find the best solution for >1G on a
 trunk and <$50K per box.

Certainly the days of doing NxGE to servers should be behind us.
There are many good 10GE switch offerings.  The Juniper QFX and Arista
(insert one of three good product lines here) have been excellent for
us.  We like them for different reasons -- Arista is quite good if you
want to integrate with a provisioning system; QFX is our choice when
most provisioning is done manually.  Both are way under $50k per box.

The biggest difference between the TOR-style switches and chassis
offerings, aside from the obvious, is buffers.  All the TOR-type 10G
switches have really small buffers and that can be a performance issue
for iSCSI when utilization is high.  Most of the chassis-type switches
have very generous buffers so you do not run into problems with
micro-bursting, etc.  The vendors will all tell you about lossless
ethernet, flow control, etc. and that crap sounds great on paper.  Try
making it actually work.  You'll want those days of your life back. :)

$0.02.
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Flood affecting US east coast communication facilities?

2012-10-30 Thread Jeff Wheeler

On Tue, Oct 30, 2012 at 3:46 AM, Kauto Huopio ka...@huopio.fi wrote:
 Any reports on damage to communications facilities on US east coast?

Yes.  The outages list is a better place to look for this information.

https://puck.nether.net/pipermail/outages/2012-October/date.html

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 Address allocation best practises for sites.

2012-09-24 Thread Jeff Wheeler

On Mon, Sep 24, 2012 at 6:52 PM, John Mitchell mi...@illuminati.org wrote:
 Does the best practise switch to now using one IPv6 per site, or still the
 same one IPv6 for multi-sites?

Certainly it would be nice to have an IPv6 address per vhost.  In many
cases, this will be practical.

It also sometimes will NOT be practical.

Imagine that I am one of the rather clueless hosting companies who are
handing out /64 networks to any customer who asks for one, and using
NDP to find the machine using each address in the /64.  Churn problems
aside, if you have any customer doing particularly dense virtual
hosting, say a few thousand IPv6 addresses on his one or more
machines, then he will use up the whole NDP table for just himself.
You probably won't want to be a customer on the same layer-3 device as
that guy.  Now that there might be dozens of VMs per physical server
and maybe 40 physical servers per each top-of-rack device, you can
quickly exhaust all of your NDP entries even with normal, legitimate
uses like www virtual hosting.

Now imagine the hosting company has decided the stacking trend is a
good idea, and stacked up a row of 10 EX4200s so they can all share
the same configuration, uplinks, etc.  They also share the same NDP
table, so it will be quite easy to run out of NDP (there is only room
for a few thousand entries) not just on one top-of-rack switch, but on
the whole row.
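A rough sketch of the arithmetic behind the last two paragraphs. Every number below is an assumption standing in for "dozens of VMs", "a few thousand entries", and so on; substitute figures from your own platform.

```python
def ndp_entries_needed(servers: int, vms_per_server: int, addrs_per_vm: int) -> int:
    """Neighbor entries the layer-3 device must hold if every address is active."""
    return servers * vms_per_server * addrs_per_vm


TABLE_CAPACITY = 4000  # assumed stand-in for "a few thousand entries"

# One top-of-rack device: 40 servers, 24 VMs each, 5 addresses per VM.
one_rack = ndp_entries_needed(servers=40, vms_per_server=24, addrs_per_vm=5)

# A stacked row of 10 such racks sharing one table, only 1 address per VM.
stacked_row = ndp_entries_needed(servers=400, vms_per_server=24, addrs_per_vm=1)

if __name__ == "__main__":
    for label, needed in (("one rack", one_rack), ("stacked row", stacked_row)):
        status = "fits" if needed <= TABLE_CAPACITY else "exceeds table"
        print(f"{label:12} needs {needed:5} entries: {status}")
```

Under these assumptions a single rack with modest virtual hosting already overflows the table, and the stacked row overflows it with one address per VM.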

Further, imagine you decided to use a 6500 for a room full of
customers, or even your whole datacenter, which will often work just
fine for IPv4.  Suddenly it won't for IPv6, because each customer may
want to make hundreds of NDP entries for his various virtual-hosts.
Just one busy customer with a lot of virtual hosting will run out a
resource shared by every other customer.

So yes, having an IPv6 address per each www virtual-host is certainly
a nice idea.  If you have to use NDP to get your addresses to your web
server, though, it might not be practical.  It certainly will be
foolish in a dedicated server type of environment where you are
renting individual machines or VMs and not owning your own layer-3
box.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Bell Canada outage?

2012-08-08 Thread Jeff Wheeler

On Wed, Aug 8, 2012 at 2:35 PM, Chris Stone axi...@gmail.com wrote:
 Outages mailing list is reporting that Tata is having problems in Montreal
 affecting 'many routers'...maybe this is related?

I am a transit customer of both TATA and Bell Canada.  We saw route
churn and heavy packet loss via both Bell and TATA beginning at
approximately 13:25 ET.  It took us some time to assess the situation
and deactivate both our TATA and Bell BGP sessions.

It also took over 10 minutes for my BGP withdraws to propagate from
Bell to their neighbors, including Level3.  I would guess Bell
Canada's routers all have very busy CPU.

No information from either NOC yet.
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Bell Canada outage?

2012-08-08 Thread Jeff Wheeler

We have been advised that TATA/6453 is back to normal, and
re-activated our BGP to them.  Everything seems okay on this front.

No update from Bell Canada yet.

On Wed, Aug 8, 2012 at 4:11 PM, Harald Koch c...@pobox.com wrote:
 On 8 August 2012 16:10, Zachary McGibbon
 Thanks for the info, looks like Bell needs to put some filtering on their
 customer links!

 I remember when AS577 had those... ;)

We actually have asked Bell Canada not to filter routes from us, and
use prefix-limit only, because they were not able to build a
prefix-list for us if we have any downstream customer ASNs, which we
do.  :-/

If someone at Bell Canada is reading and cares to contact me off-list,
I would sure love to get my own route filtering fixed.  I have had
little success through the normal channels.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: POTS Ending (Re: Operation Ghost Click)

2012-05-07 Thread Jeff Wheeler

On Wed, May 2, 2012 at 11:29 PM, Jared Mauch ja...@puck.nether.net wrote:
 http://www.usatoday.com/news/nation/story/2012-04-16/landline-service-becoming-obsolete/54321184/1

Indiana is doing away with its requirement that the incumbent LECs
supply voice service to rural areas.  Indiana also used to require a
telephone, and posted emergency numbers for the nearest fire, police,
and ambulance service (or 911) near any swimming pool.  The entire
section of state code regarding residential swimming pools has now
been eliminated.  A victory for rural swimming-pool owners everywhere,
now people can drown in home swimming pools with no land lines nearby
to call for help.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

filtering /48 is going to be necessary

2012-03-09 Thread Jeff Wheeler

On Fri, Mar 9, 2012 at 3:23 AM, Mehmet Akcin meh...@akcin.net wrote:
 if you know anyone who is filtering /48 , you can start telling them to STOP 
 doing so as a good citizen of internet6.

I had a bit of off-list discussion about this topic, and I was not
going to bring it up today on-list, but since the other point of view
is already there, I may as well.

Unless you are going to pay the bill for my clients to upgrade their
3BXL/3CXL systems (and similar) to XXL and then XXXL, I think we need
to do two things before IPv6 up-take is really broad:

1) absolutely must drop /48 de-aggregates from ISP blocks
2) absolutely must make RIR policy so orgs can get /48s for
anycasting, and whatever other purposes

If we fail to adjust RIR policy to account for the huge amount of
accidental de-aggregation that can (and will) happen with IPv6, we
will eventually have to do #1 anyway, but a bunch of networks will
have to renumber in order to take advantage of #2 down the road.
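As a sketch of what point #1 looks like as a filter, assuming you maintain a list of the aggregate blocks that RIRs issue to ISPs: the block list and the prefixes below are example values only, and `accept_route` is a name made up for this illustration.

```python
import ipaddress

# Hypothetical aggregate from which an ISP assigns customer space.
ISP_BLOCKS = [ipaddress.ip_network("2001:db8::/32")]


def accept_route(prefix: str) -> bool:
    """Reject /48-or-longer routes carved out of a known ISP aggregate.

    Routes outside the listed aggregates are accepted; in the scheme
    described above they would come from space set aside for /48 holders.
    """
    net = ipaddress.ip_network(prefix)
    for block in ISP_BLOCKS:
        if net.prefixlen >= 48 and net.subnet_of(block):
            return False
    return True


if __name__ == "__main__":
    for candidate in ("2001:db8::/32", "2001:db8:1234::/48", "3fff:0:1::/48"):
        print(candidate, "accept" if accept_route(candidate) else "drop")
```

The filter only works if #2 exists: without a separate range for legitimate /48s, there is no way to tell an anycast prefix from an accidental de-aggregate.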

The way we are headed right now, it is likely that the IPv6 address
space being issued today will look like the swamp in a few short
years, and we will regret repeating this obvious mistake.

We had this discussion on the list exactly a year ago.  At that time,
the average IPv6 origin ASN was announcing 1.43 routes.  That figure
today is 1.57 routes per origin ASN.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: L3 VPN Management

2012-03-07 Thread Jeff Wheeler

On Wed, Mar 7, 2012 at 2:07 AM, Leigh Porter
leigh.por...@ukbroadband.com wrote:
 What's the nicest way of allowing the ops servers all talk to each VPN 
 instance? At the moment I just use pretty normal L3VPN techniques so that
 every VPN sees routes tagged with the ops VPN target community and so that 
 the ops VPN sees all the other VPN routes but the division between VPNs is 
 maintained.

 Or, would it be nicer to have the firewall have a foot in each VPN, advertise 
 routes to ops systems to each VPN instance and receive routes from all the 
 other VPNs?

I think you may pay more money for extra firewall zones and perhaps
not receive any benefit from it.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

common time-management mistake: rack & stack

2012-02-16 Thread Jeff Wheeler

Randy's P-Touch thread brings up an issue I think is worth some
discussion.  I have noticed that a lot of very well-paid, sometimes
well-qualified, networking folks spend some of their time on rack &
stack tasks, which I feel is a very unwise use of time and talent.

Imagine if the CFO of a bank spent a big chunk of his time filling up ATMs.
Flying a sharp router jockey around to far-flung POPs to install gear
is just as foolish.

Not only does the router jockey cost a lot more to employ than a CCNA,
but if your senior-level talent is wasting time in airports and IBXes,
that is time they can't spend doing things CCNAs can't.

I was once advising a client on a transit purchasing decision, and a
fairly-large, now-defunct tier-2 ISP was being considered.  We needed
a few questions about their IPv6 plans answered before we were
comfortable.  The CTO of that org was the only guy who was able to
answer these questions.  After waiting four days for him to return our
message, he reached out to us from an airplane phone, telling us that
he had been busy racking new routers in several east-coast cities (his
office was not east-coast) and that's why he hadn't got back to us
yet.

As you might imagine, the client quickly realized that they didn't
want to deal with a vendor whose CTO spent his time doing rack & stack
instead of engineering his network or engaging with customers.  If he
had simply said he was on vacation, we would never have known how
poorly the senior people at that ISP managed their time.

With apologies to Randy, let the CCNAs fight with label makers.
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Common operational misconceptions

2012-02-15 Thread Jeff Wheeler

On Wed, Feb 15, 2012 at 3:47 PM, John Kristoff j...@cymru.com wrote:
 I have a handful of common misconceptions that I'd put on a top 10 list,

By your classful addressing example, it sounds like these students are
what most nanog posters would consider to be entry-level.

RFC1918 is misused a lot by entry-level folks; most seem not to know
about 172.16.0.0/12.

I think students should be able to learn how traceroute actually
works, which, I have found, is a lot easier to teach as a conceptual
lesson than by just telling them "maybe the problem is in the return
path" without giving them any understanding of how or why.

MTU, Path MTU Discovery, and MSS

NxGE isn't a serial 4Gbps link, and why this is so important

On the other hand, more than half of the CCIEs I have worked with are
clueless about all of the above.  :-/
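On the 172.16.0.0/12 point above, a quick way to show students where the block actually ends. This is a small sketch using Python's standard `ipaddress` module; the helper name is made up for the example.

```python
import ipaddress

# The three private-use blocks from RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """True if the address falls inside any of the three private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    middle = RFC1918_BLOCKS[1]
    # A /12 spans sixteen /16s, not just 172.16.x.x.
    print(middle[0], "-", middle[-1])
    for probe in ("172.16.5.1", "172.31.255.255", "172.32.0.1"):
        print(probe, is_rfc1918(probe))
```

The common mistake is treating the block as 172.16.0.0/16; the /12 runs through 172.31.255.255.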
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: UDP port 80 DDoS attack

2012-02-06 Thread Jeff Wheeler

On Mon, Feb 6, 2012 at 8:43 PM, Sven Olaf Kamphuis s...@cb3rob.net wrote:
 there is a fix for it, it's called putting a fuckton of ram in -most-
 routers on the internet and keeping statistics for each destination
 ip:destination port:outgoing interface so that none of them individually can
 (entirely/procentually compared to other traffic) flood the outgoing
 interface on that router... end result, if enough routers are structured
 like that, is that ddos attacks will become completely useless.

There are two obvious problems with your approach.

First, adding the policers you suggest, at the scale needed, is a
little harder than you imagine.  It's not a simple matter of the cost
of RAM but also power/heat density per port.

Second, if you re-engineer every router on the Internet to prevent an
interface from being congested by malicious flow(s) destined for one
particular destination IP:port, then DDoS attacks will simply target
multiple ports or multiple destination IP addresses that are likely to
traverse a link they are able to congest.
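The second problem can be shown with simple arithmetic. This is a sketch with made-up numbers, assuming an idealized policer that caps each destination IP:port tuple independently.

```python
def attack_traffic_passed(policer_limit_mbps: float,
                          tuples_targeted: int,
                          attack_mbps: float) -> float:
    """Attack traffic that survives per-(destination IP, port) policing.

    Assumes the attacker splits traffic evenly across the targeted tuples
    and that each tuple is capped independently at policer_limit_mbps.
    """
    per_tuple = attack_mbps / tuples_targeted
    return min(per_tuple, policer_limit_mbps) * tuples_targeted


if __name__ == "__main__":
    # Assumed figures: 100 Mb/s cap per tuple, 10 Gb/s of attack traffic.
    for tuples in (1, 10, 100):
        passed = attack_traffic_passed(100, tuples, 10_000)
        print(f"{tuples:3} tuples -> {passed:8.0f} Mb/s reaches the link")
```

Spreading the same attack over 100 tuples behind the same link defeats the policer entirely, at no extra cost to the attacker.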

If you want to dramatically increase the cost of routers in order to
solve the problem of DDoS with one deft (and expensive) move, you have
to imagine that the people behind DDoS attacks aren't complete idiots,
and will actually spend some time thinking about how to defeat your
system.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: UDP port 80 DDoS attack

2012-02-05 Thread Jeff Wheeler

On Sun, Feb 5, 2012 at 10:08 PM, Steve Bertrand
steve.bertr...@gmail.com wrote:
 This is so very easily automated. Even if you don't actually want to trigger
 the routes automatically, finding the sources you want to blackhole is as

What transit providers are doing flow-spec, or otherwise, to allow
their downstreams to block malicious traffic by SOURCE address?

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Verisign deep-hacked. For months.

2012-02-02 Thread Jeff Wheeler

On Thu, Feb 2, 2012 at 7:26 PM, Suresh Ramasubramanian
ops.li...@gmail.com wrote:
 So what part of VRSN got broken into?  They do a lot more than just DNS.

Indeed, VeriSign owns Illuminet, who are mission-critical for POTS.
Illuminet is also in the business of recording telephone calls, SMS
messages, etc. for law enforcement.

That means that a breach at VeriSign could be nothing, or it could
give bad guys access to a lot more than any breach or leak reported to
date.  Who knows?
-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: MD5 considered harmful

2012-01-27 Thread Jeff Wheeler

On Fri, Jan 27, 2012 at 6:35 PM, Keegan Holley
keegan.hol...@sungard.com wrote:
 realizes that it's ok to let gig-e auto-negotiate.  I've never really
 seen MD5 cause issues.

I have run into plenty of problems caused by MD5-related bugs.

6500/7600 can still figure the MSS incorrectly when using it.  It used
to be possible for that particular box to send over-sized frames out
Ethernet ports with MD5 enabled, which of course were likely to be
dropped by the neighboring router or switching equipment (perhaps even
carrier Ethernet equipment.)  Obviously that can be a chore to
troubleshoot.
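To see why a miscalculated MSS produces over-sized frames, here is the arithmetic as a sketch. It assumes IPv4 with no IP options and no other TCP options; it illustrates the failure mode in general and is not a reconstruction of the specific platform bug.

```python
IPV4_HEADER = 20
TCP_HEADER = 20
MD5_OPTION = 18  # TCP MD5 signature option: kind + length + 16-byte digest


def padded(option_bytes: int) -> int:
    """TCP options are padded to a multiple of 4 bytes."""
    return -(-option_bytes // 4) * 4


def ip_packet_size(payload: int, md5: bool) -> int:
    """Total IP packet size for a TCP segment carrying `payload` bytes."""
    options = padded(MD5_OPTION) if md5 else 0
    return IPV4_HEADER + TCP_HEADER + options + payload


def max_payload(mtu: int, md5: bool) -> int:
    """Largest TCP payload that still fits in one packet of size `mtu`."""
    return mtu - ip_packet_size(0, md5)


if __name__ == "__main__":
    mtu = 1500
    naive = max_payload(mtu, md5=False)    # forgets the MD5 option
    correct = max_payload(mtu, md5=True)
    print("payload ignoring MD5 option:", naive)
    print("payload accounting for it:  ", correct)
    print("packet size if naive value is used with MD5:",
          ip_packet_size(naive, md5=True))
```

Using the option-free payload size on a session that carries the MD5 option yields a packet 20 bytes larger than the MTU, which a neighbor with a strict MTU will drop.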

Sometimes we choose to use it.  Sometimes we don't.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: subnet prefix length 64 breaks IPv6?

2011-12-28 Thread Jeff Wheeler

On Wed, Dec 28, 2011 at 10:19 AM, Ray Soucy r...@maine.edu wrote:
 There are a few solutions that vendors will hopefully look into.  One
 being to implement neighbor discovery in hardware (at which point
 table exhaustion also becomes a legitimate concern, so the logic
 should be such that known associations are not discarded in favor of
 unknown associations).

Even if that is done you are still exposed to attacks -- imagine if a
downstream machine that is under customer control (not yours) has a
whole /64 nailed up on its Ethernet interface, and happily responds to
ND solicits for every address.  Your hardware table will fill up and
then your network has failed -- which way it fails depends on the
table eviction behavior.

Perhaps this is not covered very well in my slides.  There are design
limits here that cannot be overcome by any current or foreseen
technology.  This is not only about what is broken about current
routers but what will always be broken about them, in the absence of
clever work-arounds like limits on the number of ND entries allowed
per physical customer port, etc.
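A minimal sketch of the per-port limit work-around mentioned above, to show why it contains the damage. The class and its parameters are hypothetical and do not model any particular vendor's table or eviction behavior.

```python
from collections import defaultdict


class NeighborTable:
    """Toy neighbor table with a global capacity and a per-port entry cap."""

    def __init__(self, capacity: int, per_port_limit: int):
        self.capacity = capacity
        self.per_port_limit = per_port_limit
        self.entries = {}                  # address -> port
        self.per_port = defaultdict(int)   # port -> number of entries

    def learn(self, address: str, port: str) -> bool:
        """Try to install an entry; return False if a limit refuses it."""
        if address in self.entries:
            return True
        if len(self.entries) >= self.capacity:
            return False
        if self.per_port[port] >= self.per_port_limit:
            return False
        self.entries[address] = port
        self.per_port[port] += 1
        return True


def simulate() -> tuple:
    """One hostile port answers for 5000 addresses; one normal host follows."""
    table = NeighborTable(capacity=1000, per_port_limit=100)
    hostile = sum(table.learn(f"2001:db8::{i:x}", "port1") for i in range(1, 5001))
    victim_ok = table.learn("2001:db8::ffff:1", "port2")
    return hostile, victim_ok


if __name__ == "__main__":
    print(simulate())
```

Without the per-port cap, the hostile port would fill all 1000 slots and the second customer's entry would be refused or would evict something else.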

We really need DHCPv6 snooping and ND disabled for campus access
networks, for example.  Otherwise you could give out addresses from a
limited range in each subnet and use an ACL (like Owen DeLong suggests
for hosting environments -- effectively turning the /64 into a /120
anyway) but this is IMO much worse than just not configuring a /64.

On Wed, Dec 28, 2011 at 10:45 AM,  sth...@nethelp.no wrote:
 I'm afraid I don't believe this is going to happen unless neighbor
 discovery based attacks become a serious problem. And even then it would
 take a long time.

The vendors seem to range from "huh?" to "what is everyone else
doing?" to Cisco (the only vendor to make any forward progress at all
on this issue.)  I think that will change as this topic is discussed
more and more on public mailing lists, and as things like DHCPv6
snooping, and good behavior when ND is disabled on a subnet/interface,
begin to make their way into RFPs.

As it stands right now, if you want to disable the IPv6 functionality
(and maybe IPv4 too if dual-stacked) of almost any datacenter /
hosting company offering v6, it is trivial to do that.  The same is
true of every IXP with a v6 subnet.  I think once some bad guys figure
this out, they will do us a favor and DoS some important things like
IXPs, or a highly-visible ISP, and give the vendors a kick in the
pants -- along with operators who still have the "/64 or bust"
mentality, since they will then see things busting due to trivial
attacks.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: subnet prefix length 64 breaks IPv6?

2011-12-28 Thread Jeff Wheeler

On Wed, Dec 28, 2011 at 5:07 PM, Ray Soucy r...@maine.edu wrote:
 The suggestion of disabling ND outright is a bit extreme.  We don't
 need to disable ARP outright to have functional networks with a
 reasonable level of stability and security.  The important thing is

I don't think it's at all extreme.  If you are dealing with an access
network where DHCPv6 is the only legitimate way to get an address on a
given LAN segment, there is probably no reason for the router to use
ND to learn about neighbor L3-to-L2 associations.  With DHCPv6 snooping
the router can simply not use ND on that segment, which eliminates
this problem.  However, this feature is not yet available.

It would also be difficult to convince hosting customers to use a
DHCPv6 client to populate their gateway's neighbor table.  However, if
this feature comes along before other fixes, it will be a good option
for safely deploying /64s without ND vulnerabilities.
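The idea in the two paragraphs above can be sketched as a neighbor table fed only by snooped DHCPv6 leases. As noted, the feature does not exist yet, so everything here (class name, method names, behavior on a miss) is a hypothetical illustration of the concept.

```python
from typing import Optional


class SnoopedNeighbors:
    """Neighbor table populated only from observed DHCPv6 leases.

    Hypothetical model: the router never sends a Neighbor Solicitation on
    this segment, so traffic to unknown addresses cannot create state.
    """

    def __init__(self):
        self.bindings = {}  # IPv6 address -> MAC address

    def on_lease(self, address: str, mac: str) -> None:
        """Called when a DHCPv6 reply assigning `address` is seen."""
        self.bindings[address] = mac

    def on_release(self, address: str) -> None:
        """Called when the lease is released or expires."""
        self.bindings.pop(address, None)

    def resolve(self, address: str) -> Optional[str]:
        """Return the MAC for a leased address; None means drop the packet."""
        return self.bindings.get(address)


if __name__ == "__main__":
    table = SnoopedNeighbors()
    table.on_lease("2001:db8::10", "00:00:5e:00:53:01")
    print(table.resolve("2001:db8::10"))    # known host
    print(table.resolve("2001:db8::dead"))  # scan traffic: no entry created
    print(len(table.bindings))
```

The table size is bounded by the number of leases handed out, not by the size of the /64, which is why scanning the subnet no longer exhausts anything.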

 that we work with vendors to get a set of tools (not just one) to
 address these concerns.  As you pointed out Cisco has already been
 doing quite a bit of work in this area, and once we start seeing the
 implementations become more common, other vendors will more than
 likely follow (at least if they want our business).

 Maybe I'm just a glass-half-full kind of guy. ;-)

I think your view of the Cisco work is a little optimistic. :)  What
they have done so far is simply acknowledge that, yes, ND exhaustion
is a problem, and give the customer the option to mitigate damage to
individual interfaces / VLANs, on the very few platforms that support
the feature.

Cisco has also given the SUP-2T independent policers for ARP and ND,
so if you have a SUP-2T instead of a SUP720 / RSP720, your IPv4 won't
break when you get an IPv6 ND attack.  Unfortunately, there are plenty
of people out there who are running IPv6 /64s on SUP720s, most of whom do
not know that an attacker can break all their IPv4 services with an
IPv6 ND attack.

 The most important thing is that network operators are aware of these
 issues, have a basic understanding of the implications, and are
 provided with the knowledge and tools to address them.

We certainly agree here.  I am glad the mailing list has finally moved
from listening to Owen DeLong babble about this being a non-problem,
to discussing what work-arounds are possible, disadvantages of them,
and what vendors can do better in the future.

My personal belief is that DHCPv6 snooping, with ND disabled, will be
the first widely-available method of deploying /64s safely to
customer LAN segments.  I'm not saying this is good but it is a
legitimate solution.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: De-bogon not possible via arin policy.

2011-12-15 Thread Jeff Wheeler

On Thu, Dec 15, 2011 at 4:54 PM, Joel jaeggli joe...@bogus.com wrote:
 We know rather a lot about the original poster's business, it has ~34
 million wireless subscribers in north america. I think it's safe to
 assume that adequate documentation could be provided.

I missed the post where he supplied this information.  I guess his
company should have cheated the system, with full complicity of ARIN,
like Verizon Wireless did a few years ago.
http://marc.info/?l=nanog&m=123406577704970&w=4

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: local_preference for transit traffic?

2011-12-14 Thread Jeff Wheeler

On Thu, Dec 15, 2011 at 1:07 AM, Keegan Holley
keegan.hol...@sungard.com wrote:
 Had an interesting conversation with a transit AS on behalf of a customer
 where I found out they are using communities to raise the local preference

That sounds like a disreputable practice.

While not quite as obvious, some large transit ASes, like Level3,
reset the origin to I (best) sometime between when they learn it and
when they announce it to their customers and peers.  This similarly
causes them to suck in a bit more traffic than they might otherwise.
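To see why rewriting the origin attracts traffic, here is a heavily truncated sketch of BGP best-path selection: only local preference, AS path length, and origin are compared, and the routes and ASNs are invented for the example.

```python
# Lower rank is preferred: IGP beats EGP, which beats INCOMPLETE.
ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


def best_route(routes):
    """Pick a route using only the first few BGP tie-breakers.

    Order: highest local preference, then shortest AS path, then lowest
    origin. The later steps (MED, eBGP vs iBGP, IGP cost, router ID) are
    deliberately left out of this sketch.
    """
    return min(
        routes,
        key=lambda r: (-r["local_pref"], len(r["as_path"]), ORIGIN_RANK[r["origin"]]),
    )


# The same prefix heard via two transits with equal-length AS paths.
routes = [
    {"via": "transit-a", "local_pref": 100,
     "as_path": [64500, 64510], "origin": "IGP"},         # origin was reset
    {"via": "transit-b", "local_pref": 100,
     "as_path": [64501, 64510], "origin": "INCOMPLETE"},  # origin left alone
]

if __name__ == "__main__":
    print(best_route(routes)["via"])
```

Whenever local preference and path length tie, the transit that reset the origin wins, so it pulls in traffic it would otherwise have split with its competitor.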

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: local_preference for transit traffic?

2011-12-14 Thread Jeff Wheeler

On Thu, Dec 15, 2011 at 2:24 AM, Keegan Holley
keegan.hol...@sungard.com wrote:
 I always assumed that taking in more traffic was a bad thing.  I've heard
 about one sided peering agreements where one side is sending more traffic
 than the other needs them to transport. Am I missing something?  Would this
 cause a shift in their favor allowing them to offload more customer traffic
 to their peers without complaint?

Well, if Level3 wanted less ingress traffic, they would probably stop
this practice.  I would imagine they thought about it carefully.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Writable SNMP

2011-12-06 Thread Jeff Wheeler

On Tue, Dec 6, 2011 at 11:07 AM, Keegan Holley
keegan.hol...@sungard.com wrote:
 For a few years now I have been wondering why more networks do not use writable
 SNMP.  Most automation solutions actually script a login to the various

I've spent enough time writing code to deal with SNMP (our own stack,
not using Net-SNMP or friends) to have a more in-depth understanding
of SNMP's pitfalls than most people.  It is TERRIBLE and should be
totally gutted and replaced with something more sane, less likely to
have bugs, etc.  There is a good reason why many major bugs have
popped up over the years allowing devices to be crashed with crafted
SNMP packets -- it's honestly not that easy to get right, especially
if you really implement every possible encoding so some random
customer with a brain-damaged SNMP client stack won't come crying to
you that his client won't work.

Juniper does not support writing via SNMP.  I am glad.  Hopefully that
is the first step toward not supporting SNMP at all.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Link local for P-t-P links? (Was: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?)

2011-12-01 Thread Jeff Wheeler

On Wed, Nov 30, 2011 at 9:15 PM, Mike Jones m...@mikejones.in wrote:
 Link-Local?

 For true P-t-P links I guess you don't need any addresses on the

Point-to-point links in your backbone are by far the easiest thing to
defend against this attack.  I wish we would steer the discussion away
from point-to-point links that are entirely within the control of the
operator, as this is really quite well understood.  Major ISPs
including Level3 are already doing /126 to their customers today as
well.  In fact, Level3 does not even reserve a /64, they will hand out
::0/126 to one customer on a given access router, ::4/126 to the next.
It clearly works.
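The packing scheme described above is easy to reproduce with Python's `ipaddress` module. The parent prefix is a documentation value chosen for the example, not anything Level3 actually uses.

```python
import ipaddress
import itertools

# Hypothetical /64 set aside on one access router for customer links.
LINK_BLOCK = ipaddress.ip_network("2001:db8:0:1::/64")


def first_link_subnets(count: int):
    """Return the first `count` consecutive /126 subnets of LINK_BLOCK."""
    return list(itertools.islice(LINK_BLOCK.subnets(new_prefix=126), count))


if __name__ == "__main__":
    for customer, subnet in enumerate(first_link_subnets(3), start=1):
        print(f"customer {customer}: {subnet} ({subnet.num_addresses} addresses)")
```

Each /126 holds four addresses, so consecutive customers land on ::0, ::4, ::8 and so on, and the router never has to resolve more than a handful of neighbors per link.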

The access layer for non point-to-point customers, on the other hand,
is less well-understood.  That's why we keep having these discussions.
Getting customers (and their device/software) to work correctly with
link-local addressing and DHCP-PD or similar is going to be an uphill
battle in a hosting environment.  It also breaks down immediately if
the hosting customer, for example, wishes to use ND to be able to
provision addresses on two or more servers from a common subnet.  So
there are both perception and practical problems / limitations with
this approach.  I'm not saying it's a bad idea, but it won't work in
some instances.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-12-01 Thread Jeff Wheeler

On Thu, Dec 1, 2011 at 9:42 AM, Chuck Anderson c...@wpi.edu wrote:
 Jumping in here, how about static ND entries?  Then you can use the
 /64 for P-t-P, but set the few static ND entries you need, and turn
 off dynamic ND.  An out-of-band provisioning system could add static
 ND entries as needed.

 Another idea, perhaps more useful for client LANs, would be to have a
 fixed mapping between IPv6 IID and MAC address.  Use DHCPv6 to force

Chuck, you are certainly correct that if ND resolution can be
deactivated for an interface, there won't be an ND exhaustion problem
on it.  That's why I characterize the problem as ND exhaustion.  :-)
Whether or not this is practical for a given environment is up to the
operators to decide.

I, for example, know it is much easier for me to configure a /126
P-t-P than keep static ND entries and disable ND on those links.  In
my own backbone, your suggestion can be practical, but what about
customer links?  If the customer changes his device, he may present a
different MAC address to my interface.  Then I've got a static ND
entry pointing to his old MAC address... resulting in a ticket, and
ops work, which would not have been necessary with a simple /126.

DHCPv6 with snooping and learning disabled would be great for the
datacenter LAN if I thought I could get customers to bite off on it.
When vendors begin delivering this feature it is something we will
strongly consider.  I don't know if customers will prefer to have this
and need to run a DHCPv6 client, or prefer to have a /120 (or similar)
for the approximate number of addresses they plan to use.

I am not closed to alternatives.  I want to give my customers /64s as
soon as it becomes practical and production-ready.  That is why we
always reserve a /64 for each subnet, even though it is provisioned as
a longer subnet.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-11-30 Thread Jeff Wheeler

On Wed, Nov 30, 2011 at 9:48 AM, Ray Soucy r...@maine.edu wrote:
 1. Using a stateful firewall (not an ACL) outside the router
 responsible for the 64-bit prefix.  This doesn't scale, and is not a
 design many would find acceptable (it has almost all the problems of
 an ISP running NAT)

Owen has suggested stateful firewall as a solution to me in the
past.  There is not currently any firewall with the necessary features
to do this.  We sometimes knee-jerk and think stateful firewall has
gobs of memory and can spend more CPU time on each packet, so it is a
more likely solution.  In this case that does not matter.  You can't
have 2^64 bits of memory.

You could make a firewall with the needed features (or a layer-3
switch), but it would have to be the layer-3 gateway of the subnets
you are protecting (not an upstream device) and it would need
knowledge of all addresses in use on the subnet, which must fit within
its ND table limits.  Only DHCP snooping can do this and customers are
not exactly keen on receiving DHCP-assigned addresses in mixed
datacenter environments, even if the addresses are static ones.  Once
you do that, you need to limit the number of addresses that can be
leased to each customer to far less than a /64 anyway.  All you gain
by having all that complexity is the appearance of bigger subnets,
when in reality, they are non-functional except for the limited number
of addresses which are actively leased out.

Again the arguments for /64 are not promising.  It is much less
complicated to simply deploy a longer subnet.

On Wed, Nov 30, 2011 at 11:13 AM, Jimmy Hess mysi...@gmail.com wrote:
 On Wed, Nov 30, 2011 at 8:48 AM, Ray Soucy r...@maine.edu wrote:
 Saying you can mitigate neighbor table exhaustion with a simple ACL
 is misleading (and you're not the only one who has tried to make that
 claim).

 It's true, though, you can.

 From a network design POV, there may still be reasons to prefer the ACL 
 method.
 They better be good reasons, such as a requirement for SLAAC on a large LAN.

No, Jimmy, you can't do that with SLAAC.  I do not think you
understand the problem.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-11-30 Thread Jeff Wheeler

On Wed, Nov 30, 2011 at 3:13 PM, Owen DeLong o...@delong.com wrote:
 As such, I prefer to deploy IPv6 as it is today and resolve the bugs
 and the security issues along the way (much like we did with IPv4).

Why is the Hurricane Electric backbone using /126 link-nets, not /64?
You used to regularly claim there are significant disadvantages to
longer subnets.  At best, you are still claiming there are no
advantages.  These are lies.  Please, Owen, tell us why you aren't
practicing what you preach.

 I haven't said that security issues should be ignored, either. Just that
 they should be viewed in a proper context and assessed with a realistic
 evaluation of the magnitude of the risk and the difficulty of mitigation.

You repeatedly claim that ND exhaustion is a non-issue.  You also
claim you have secret sauce to mitigate attacks.  This, after you
previously claimed that you were using common ACLs to mitigate
attacks, and I showed you how that cannot be true.  Your understanding
of this problem has rocketed from totally clueless to having secrets
you can't discuss.  Except it isn't, because you are also advocating
... denying all traffic to all subnets except the first few hundred
addresses.  What a stellar plan!

Just stop telling lies about this, Owen.  That's all I'm asking.

You, personally, are part of the problem.  If the guy who is supposed
to be the public-facing technical outreach guy for the self-described
leader in IPv6 transit/hosting/etc services continues to go around
claiming this is a non-issue, when it very clearly is, that is
destructive, not helpful.

 What has also been lost here is that my description of the various
 mitigation tactics for ND exhaustion attacks depends on the type
 of network being protected. Strategies that work for point-to-point
 links (simple ACLs at the borders in most environments, for
 example) are not the same as strategies that work to protect
 client LANs (stateful firewalls with default deny inbound) or
 strategies necessary to protect server LANs (slightly more complex
 ACLs and other tactics).

You have no such simple ACLs at the borders on the Hurricane
Electric network.  In fact, your mitigation mechanism for the backbone
is exactly what I recommend: deploy longer subnets.  You don't have
any mitigation mechanism for your hosting services, other than
whack-a-mole.

If anyone has trouble believing me, you can do what I did, and email
Owen off-list.  You can say, Owen, I'd like to subscribe to a
Hurricane Electric dedicated server, get myself a /64, and DoS my own
subnet, to see if that affects my box or any other nearby customers.
The reply you'll get will be that your box will be powered off,
because they have no mitigation strategy.

Arguing in the abstract is all fun and games, but when you ask Owen to
show you something that works in a real-world, production environment,
he can't.  That's because Owen's network design is not suitable for
production use in his own environment with routers he claims to have
selected in part based on their performance under ND attacks (another
lie.)

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-11-29 Thread Jeff Wheeler

On Tue, Nov 29, 2011 at 1:43 AM,  valdis.kletni...@vt.edu wrote:
 It's worked for us since 1997.  We've had bigger problems with IPv4 worms

That's not a reason to deny that the problem exists.  It's even
fixable.  I'd prefer that vendors fixed it *before* there were massive
botnet armies with IPv6 connectivity, but in case they don't, I do not
deploy /64.

On Tue, Nov 29, 2011 at 2:20 AM, Jonathan Lassoff j...@thejof.com wrote:
 Agreed. While I don't have any good numbers that I can publicly offer up, it
 also intuitively makes sense that there's a greater proportion of IPv4 DDOS
 and resource exhaustion attacks vs IPv6 ones.

Of course.  There are comparably few hosts with IPv6 connectivity.
Bad guys aren't that familiar with IPv6 yet.  Even if they are, their
armies of compromised desktops probably can't launch an effective IPv6
attack yet.  Lack of sources, no way to get nasty IPv6 packets to the
target, or the target has different infrastructure for IPv4 and IPv6
anyway, and taking out the IPv6 one only isn't that beneficial (Happy
Eyeballs features and such.)

Further, the victim can just turn off IPv6 when they start getting
attacked in this way.  And that is exactly what sites will end up
doing, turning off IPv6 because vendors aren't addressing issues like
these.  That doesn't help anyone.

 I imagine the mitigation strategies are similar for both cases though: just
 rate-limit how often your router will attempt neighbor discovery. Are there
 other methods?

Simply rate-limiting the data-plane events that trigger ND resolution
is not good enough.  One very popular platform that is offered with
cards in horizontal or vertical orientation uses the same policer for
ARP and NDP.  That means when you do eventually start getting ND
attacks, it will break your IPv4 services also.
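
A toy model may help show why a shared policer is a problem.  The
budget and packet counts below are invented for illustration and do
not describe any particular platform; the model simply admits punted
packets first-come first-served until the budget is spent.

```python
def admit(budget, arrivals):
    """Admit packets in arrival order until `budget` is used up.

    `arrivals` is a list of protocol labels; returns admitted counts.
    """
    counts = {}
    for proto in arrivals[:budget]:
        counts[proto] = counts.get(proto, 0) + 1
    return counts

BUDGET = 100  # hypothetical punts allowed per interval

# Hypothetical interval: 9,990 attack-driven ND punts, 10 legitimate ARPs.
arrivals = (["nd"] * 999 + ["arp"]) * 10

# One policer shared by both protocols: the ND flood eats the budget.
shared = admit(BUDGET, arrivals)

# A dedicated ARP policer with the same budget: ARP is unaffected.
separate_arp = admit(BUDGET, [p for p in arrivals if p == "arp"])

print("shared policer admits ARP:", shared.get("arp", 0))
print("dedicated policer admits ARP:", separate_arp.get("arp", 0))
```

In this model the shared policer admits no ARP at all during the
flood, while a dedicated one admits every ARP packet.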

If you want to learn more about this, I have some slides:
http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-11-29 Thread Jeff Wheeler

On Tue, Nov 29, 2011 at 12:42 AM, Owen DeLong o...@delong.com wrote:
 That's _NOT_ a fair characterization of what I said above, nor is it
 a fair characterization of my approach to dealing with neighbor table
 attacks.

Here are some direct quotes from our discussion:
 Since we have relatively few customers per aggregation router, I don't
 think you'd be nearly as successful as you think you would.

 On the platforms we use, it won't spill over into IPv4 breakage. Additionally,
 it will break fewer than 50 other dedicated servers and no other customers.
 This will be tolerated for about 5 minutes while our support department
 receives the alarm and looks into the cause, then, your dedicated server
 won't have power any more and the problem will go away along with your
 account.

At no time did you ever suggest you had any idea how to systemically
solve this problem.  Now you are claiming that you have a more
global approach, but this is another of your lies.

 All of our support guys have enough clue to get to this quickly enough
 and our monitoring systems would detect the abnormally large
 neighbor table fairly early in the process.

Your monitoring systems keep an eye on the size of your ND tables?
How can this be true if you believe that ND attacks are not an issue?
Did you just throw resources at monitoring this for no reason?  Do you
really even poll or alert on this data, or were you simply telling
another lie?

 Additionally, we have a network engineer on duty 24x7, so, even
 if the support guys don't figure it out correctly, there's backup with
 clue right behind them in the same room.

That is exactly NOC whack-a-mole.

 What I said above is that if you allow random traffic aimed at your
 point to point links, you're doing something dumb.

If your network has nothing but point-to-point links, it is easy to
defend.  Sadly that is not how you or I interface with many of our
customers.

In addition, you don't actually practice what you preach.  Hurricane
Electric uses /126 networks in its backbone.  Why are you not rushing
to change these to /64?  After all, you regularly tell us about the
supposed disadvantages of /126 on point-to-point links.  What are
these disadvantages?

 As to my actual plan for dealing with it, what I said was that if we
 ever see a neighbor table attack start causing problems with services,
 we'd address it at that time. Likely we would address it more globally
 and not through a whack-a-mole process.

No, this is not what you said.  Again, you are simply telling lies.

 I did not give details of all of our mitigation strategies, nor can I.

Yes you did.  Your strategy is whack-a-mole.

 What I can say is that we do have several /64s that could be attacked
 such that we'd notice the attack. Most likely the attack wouldn't break
 anything that is a production service before we could mitigate it.

Breaking about 50 customers for as long as it takes your support staff
or NOC to troubleshoot, in your mind, must not be "breaking anything
that is a production service," or else "before we could mitigate it"
means you have figured out how to travel through time.

 In more than a decade of running production IPv6 networks, we have
 yet to see a neighbor table attack. Further, when you consider that
 most attacks have a purpose, neighbor table attacks are both more
 difficult and less effective than other attack vectors that are readily
 available. As such, I think they are less attractive to would-be attackers.

Again, the bad guys don't have much motive (yet) since few services
of interest share common IPv4 and IPv6 infrastructure today.  That
will change.

 No, there is a third possibility.

 I don't mind you taking a frank private discussion public (though
 it's not very courteous), but, I do object to you misquoting me
 and misconstruing the nature and substance of what I said.

It's disingenuous of you to continue to lie every time this topic
comes up on the mailing list.

 Yes, ND attacks are possible if you leave your /64 wide open to
 external traffic. However, if you're using your /64 to provide services,
 chances are it's pretty easy to cluster your server in a much smaller
 range of addresses. A simple ACL that only permits packets to
 that range (or even twice or 4 times that range) will effectively
 block any meaningful ND attack.

 For example, let's say you use 2001:db8:fe37:57::1000:0
 to 2001:db8:fe37:57::1000:01ff as the IPv6 range for a
 set of servers.

 Let's say there are 200 servers in that range.

 That's 200/512 good ND records for servers and 312 slots
 where you can put additional servers. That gives you a total
 attack surface of 312 incomplete ND records.

 This was part of my frank private discussion with Jeff, but,
 he seems to have forgotten it.
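
The arithmetic in the quoted example can be checked with a short
sketch.  It assumes the range was meant to run from
2001:db8:fe37:57::1000:0 through 2001:db8:fe37:57::1000:1ff, and the
variable names are only illustrative.

```python
import ipaddress

# Assumed endpoints of the permitted server range from the quoted example.
first = ipaddress.IPv6Address("2001:db8:fe37:57::1000:0")
last = ipaddress.IPv6Address("2001:db8:fe37:57::1000:1ff")

permitted = int(last) - int(first) + 1   # addresses the ACL lets through
servers = 200                            # live hosts, per the example
unused = permitted - servers             # slots an attacker can still probe

# The same range expressed as the prefix(es) an ACL entry would carry.
covering = list(ipaddress.summarize_address_range(first, last))

print(permitted, "permitted,", unused, "unused, covered by", covering)
```

The range collapses to a single /119, so the ACL in the example
amounts to deploying a /119 while numbering it out of a /64.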

Since I've re-read our earlier discussion (unlike you) I can state
with certainty that it was not part of our earlier discussion.  If it
was, I would be happy to tell everyone that your plan for deploying
IPv6 to

Re: IPv6 prefixes longer then /64: are they possible in DOCSIS networks?

2011-11-28 Thread Jeff Wheeler

On Mon, Nov 28, 2011 at 4:51 PM, Owen DeLong o...@delong.com wrote:
 Technically, absent buggy {firm,soft}ware, you can use a /127. There's no
 actual benefit to doing anything longer than a /64 unless you have
 buggy *ware (ping pong attacks only work against buggy *ware),
 and there can be some advantages to choosing addresses other than
 ::1 and ::2 in some cases. If you're letting outside packets target your
 point-to-point links, you have bigger problems than neighbor table
 attacks. If not, then the neighbor table attack is a bit of a red-herring.

Owen and I have discussed this in great detail off-list.  Nearly every
time this topic comes up, he posts in public that neighbor table
exhaustion is a non-issue.  I thought I'd mention that his plan for
handling neighbor table attacks against his networks is whack-a-mole.
That's right, wait for customer services to break, then have NOC guys
attempt to clear tables, filter traffic, or disable services; and
repeat that if the attacker is determined or going after his network
rather than one of his downstream customers.

I hate to drag a frank, private discussion like that into the public
list; but every time Owen says this is a non-issue, you should keep in
mind that his own plan is totally unacceptable for any production
service.  Only one of the following things can be true: either 1) Owen
thinks it is okay for services to break repeatedly and require
operator intervention to fix them if subjected to a trivial attack; or
2) he is lying.  Take that as you will.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anyone seen this kind of problem? SIP traffic not getting to destination but traceroute does

2011-11-09 Thread Jeff Wheeler

On Wed, Nov 9, 2011 at 1:47 PM, Jay Nakamura zeusda...@gmail.com wrote:
 So my questions is, is it possible there is some kind of filter at
 Qwest or Level 3 that is dropping traffic only for udp 5060 for select
 few IPs?  That's the only explanation I can come up with other than

I ran into exactly this problem last week with Rogers.  All traffic
from the client except udp/5060 could be received by us, and udp/5060
was blocked.  We tested other IP addresses on our (provider) side and
did not find any blocking there, so we assigned a new IP to the SIP
gateway.  I hardly think this can be an ordinary malfunction, but good
luck getting a phone company to troubleshoot a problem with their
subscribers using mobile data to connect to a third-party voice
gateway...

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: BGP conf

2011-11-02 Thread Jeff Wheeler

On Wed, Nov 2, 2011 at 7:50 PM, Edward avanti edward.ava...@gmail.com wrote:
 sorry, my english not so perfect, at no time I mean send to IX what Verizon
 send me, I'm not THAT stupid hehe
 I mean if destination/origin is via IX, then send THAT traffic only by IX
 and not Verizon.

I understood what you mean.  The recommendations in my earlier reply
are still the best ones you've received:
1) hire a consultant to assist you both now and with any future problems
or 2) do not worry about being multi-homed, because the extra
complexity will do you more harm than good

Imagine if you took your car to a shop and asked for new tires, and
the mechanic said, well, I have never changed tires before and I'm
not sure I have the right tools, but if you give me a couple of days I
think I can read about it on the Internet and figure it out.  Of
course you would not buy tires from him, you would go to another shop.
 That mechanic would quickly find that, if he wants to sell tires, he
needs to learn how to install them or hire someone to do it for him.

What you are asking your boss/company to do is trust you to put tires
on their car without the right tools or knowledge.  The result of that
is probably how your network will end up: a wreck.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: BGP conf

2011-11-02 Thread Jeff Wheeler

On Wed, Nov 2, 2011 at 8:44 PM, Jack Bates jba...@brightok.net wrote:
 Now I have the mile long monstrosity that uses BGP communities for
 everything, and of route-maps/policies with prefix-lists for downstream
 customers. You have to start somewhere.

 cymru secure bgp templates is probably a good beginning.

I guess ten years of watching RIRs and users de-bogon new /8s didn't
teach you why those Cymru examples are more dangerous than they are
good.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: BGP conf

2011-11-02 Thread Jeff Wheeler

On Wed, Nov 2, 2011 at 10:04 PM, Jack Bates jba...@brightok.net wrote:
 Have to read the current cymru bgp templates?

 ! manner. Why not consider peering with our globally distributed bogon
 ! route-server project? Alternately you can obtain a current and well

I'm not telling you something you don't already know, but for the
novices who regard this list as a source of expertise, I will explain
in greater detail why this is a really dumb idea.

If you took a list of bogons over eBGP from Cymru, you would get
unused /8s and similar.  What you don't get is a route that matches
whatever silly thing someone on the DFZ accidentally leaked: a
more-specific that will still cause you to route traffic to their
leaked prefix out to the Internet (and presumably, to their network.)

There is nothing good about this.  It's just adding unnecessary
complexity for no operational benefit.  There is harm in it, too: it
adds complexity and risk.  What is that risk?  If you decide that the
Cymru distributed bogon route-server is for you, and simply rewrite
next-hops received on that session to Null0, it is possible that Cymru
could make an error, or otherwise introduce non-bogon routes into your
network as if they were bogons, causing black-holes.  This is
obviously too much to risk for something that has no operational
benefit.
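
The more-specific problem can be sketched with a longest-prefix-match
lookup.  The routing table below is invented for illustration:
198.18.0.0/15 (the benchmarking range) stands in for a bogon received
from a route-server, and 198.18.5.0/24 is a hypothetical leaked
more-specific learned from transit.

```python
import ipaddress

def lookup(table, destination):
    """Return the (prefix, next_hop) entry with the longest match, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, nh) for net, nh in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

table = [
    (ipaddress.ip_network("0.0.0.0/0"), "transit"),
    (ipaddress.ip_network("198.18.0.0/15"), "Null0"),    # bogon feed route
    (ipaddress.ip_network("198.18.5.0/24"), "transit"),  # leaked more-specific
]

leaked = lookup(table, "198.18.5.1")
covered = lookup(table, "198.18.9.1")
print("198.18.5.1 ->", leaked[0], "via", leaked[1])
print("198.18.9.1 ->", covered[0], "via", covered[1])
```

Traffic to the leaked /24 still leaves via transit because the longer
prefix wins, so the bogon route only discards traffic that nobody was
announcing anyway.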

The Cymru guys do many positive things.  One of the more questionable
things they do, though, is operate a route-server with the intention
of black-holing botnet C&C IPs on a very wide scale.  This is
certainly a positive thing to do, but it was not done in a transparent
manner; and in fact didn't even have management approval at Cogent
when they configured it on their network.  There was no established
channel to find out why your IP address appeared on this list or to
get it removed.  All it took for me to get the whole idea canned at
Cogent was one inquiry to management, asking why engineers had quietly
started using a clandestine blackhole list operated by a third-party
and would not give any answers to a customer if one of their IPs
appeared on that list.  The IP address I inquired about was certainly
not a botnet C&C node, and how it ended up on that list is a mystery.
I'm not saying there was any malicious intent, but it was a mistake at
least.

Trusting that bogon black-hole list to do something you don't even
need to do anyway is not smart.  It's *especially* not smart for some
novice who doesn't understand the implications of his decision.  This
is the danger of cut & paste engineering.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: iCloud - Is it going to hurt access providers?

2011-09-04 Thread Jeff Wheeler

On Sun, Sep 4, 2011 at 4:45 PM, Wayne E Bouchard w...@typo.org wrote:
 Okay, so to state the obvious for those who missed the point...

 The congestion will either be directly in front of user because
 they're flooding their uplink or towards the destination (beit a
 single central network or a set of storage clusters housed at, say, 6
 different locations off 3 different providers.) It is very hard, in my

If scaling up Internet bandwidth were the hardest thing about
deploying SaaS / cloud services, don't you think transit vendors
would suddenly be more profitable than EMC and friends?  It should be
obvious to you, and everyone else, that datacenter Internet
connectivity is a trivial concern compared to everything else that
goes into these platforms.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Deploying IPv6 Responsibly

2011-08-19 Thread Jeff Wheeler

On Fri, Aug 19, 2011 at 12:59 PM, Frank Bulk frnk...@iname.com wrote:
 I just noticed that the quad-A records for both those two hosts are now
 gone.  DNS being what it is, I'm not sure when that happened, but our
 monitoring system couldn't get the AAAA for www.qwest.com about half an hour
 ago.

 Hopefully CenturyLink is actively working towards IPv6-enabling their sites
 again.

I hope that they aren't.  It doesn't help anyone for Qwest/CenturyLink
to publish AAAA records or otherwise activate IPv6 services if they
have no system for monitoring their single most publicly-visible
service, no mechanism for alerting engineers or system administrators
of trouble, no way to act on problem reports generated by users after
*ten days*, and apparently no ability to actually fix the problem in a
timely manner when someone with a clue finally realized what was going
on.

Let's not encourage Qwest, or anyone else, to deploy any more IPv6
services until they get a few things in order first.  Simply turning
the AAAA record back on before major, systemic oversights within the
organization are fixed would be irresponsible.  It will not help IPv6
progress, it will hurt it.

Every other network should keep this in mind as well.  If you can't
support your IPv6 services, don't deploy them for public use yet!
This doesn't mean don't work on it, but if your tech support staff
don't know how to handle calls, if the workstations in your
call-center don't have IPv6, if you haven't trained every person on
the escalation tree -- publishing an AAAA record for www.foo.com is a
pretty stupid thing to do.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: OSPF vs IS-IS

2011-08-12 Thread Jeff Wheeler

I thought I'd chime in from my perspective, being the head router
jockey for a bunch of relatively small networks.  I still find that
many routers have support for OSPF but not IS-IS.  That, plus the fact
that most of these networks were based on OSPF before I took charge of
them, in the absence of a compelling reason to change to another IGP,
keeps me from taking advantage of IS-IS.  I'd like to, but not so
badly that I am willing to work around those routers without IS-IS, or
weight that feature more heavily when purchasing new equipment.

There are many routers with OSPF but no IS-IS.  I haven't seen any
with IS-IS but no OSPF.  I don't think such a router would be very
marketable to most non-SP networks.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-10 Thread Jeff Wheeler

On Wed, Aug 10, 2011 at 6:55 AM, Alexander Harrowell
a.harrow...@gmail.com wrote:
 Thinking about the CPE thread, isn't this a case for bridging as a
 feature in end-user devices? If Joe's media-centre box etc would bridge
 its downstream ports to the upstream port, the devices on them could
 just get an address, whether by DHCPv6 from the CPE router's delegation
 or by SLAAC, and then register in local DNS or more likely do multicast-
 DNS so they could find each other.

This would require the ISP gateway to have IPv6 ND entries for all of
the end-user's devices.  If that is only a few devices, like the
typical SOHO LAN today, that's probably fine.  It is not fine if I
purchase some IPv6-connected nanobots.  Given today's routers, it is
probably not even fine if the average SOHO goes from 1 state entry to
just 20 or 30.  I have about 20 devices in my home that use the
Internet -- TVs, DVRs, VoIP telephones, printer, mobile phones with
Wi-Fi, a couple of video game consoles, etc.  I imagine that is not
atypical these days.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-10 Thread Jeff Wheeler

On Wed, Aug 10, 2011 at 2:03 PM, Owen DeLong o...@delong.com wrote:
 That said, /48 to the home should be what is happening, and /56 is
 a better compromise than anything smaller.

Is hierarchical routing within the SOHO network the reason you believe
/48 is useful?  You don't really imagine that end-users will require
more than 2^8 subnets, but that they will want several levels of very
simple, nibble-aligned routers within their network?
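
For reference, the subnet counts behind that question work out as in
the sketch below.  It assumes fixed /64 subnets and one hexadecimal
digit (nibble) of the prefix per level of routing hierarchy; the
function names are illustrative.

```python
def subnet_count(delegation_len, subnet_len=64):
    """Number of subnets of `subnet_len` inside a delegated prefix."""
    return 2 ** (subnet_len - delegation_len)

def nibble_levels(delegation_len, subnet_len=64):
    """Hierarchy levels available if each level consumes one nibble."""
    return (subnet_len - delegation_len) // 4

for length in (56, 48):
    print(f"/{length}: {subnet_count(length)} subnets, "
          f"{nibble_levels(length)} nibble-aligned levels")
```

A /56 yields 2^8 subnets and two nibble-aligned levels; a /48 yields
2^16 subnets and four.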

This is perhaps a good discussion to have.  I, for one, see CPE
vendors still shipping products without IPv6 support at all, let alone
any mechanism for creating an address or routing hierarchy within the
home without the end-user configuring it himself.  I am not aware of
any automatic means to do this, or even any working group trying to
produce that feature.

Is it true that there is no existing work on this?  If that is the
case, why would we not try to steer any such future work in such a way
that it can manage to do what the end-user wants without requiring a
/48 in their home?

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-10 Thread Jeff Wheeler

On Wed, Aug 10, 2011 at 7:12 PM, Owen DeLong o...@delong.com wrote:
 Is it true that there is no existing work on this?  If that is the
 case, why would we not try to steer any such future work in such a way
 that it can manage to do what the end-user wants without requiring a
 /48 in their home?

 No, it is not true.

Can you give any example of a product, or on-going work?  I have read
two posts from you today saying that something either exists already,
or is being worked on.  I haven't read this anywhere else.

 I suppose that limiting enough households to too small an allocation
 will have that effect. I would rather we steer the internet deployment
 towards liberal enough allocations to avoid such disability for the
 future.

 Have we learned nothing from the way NAT shaped the (lack of)
 innovation in the home?

I am afraid we may not have learned from exhausting IPv4.  If I may
use the Hurricane Electric tunnel broker as an example again,
supposing that is an independent service with no relation to your
hosting, transit, etc. operations, it can justify a /24 allocation
immediately under 2011-3, without even relying on growth projections.
That's a middle ground figure that we can all live with, but it is
based on you serving (at this moment) only 8000 tunnels at your
busiest tunnel gateway.  If your tunnel gateways could serve 12,288 +
1 users each, then your /24 justification grows to a /20.  So you
would have a pretty significant chunk of the available IPv6 address
space for a fairly small number of end-users -- about 72,543 at
present.

It isn't hard to do some arithmetic and guess that if every household
in the world had IPv6 connectivity from a relatively low-density
service like the above example, we would still only burn through about
3% of the IPv6 address space on end-users (nothing said about server
farms, etc. here) but what does bother me is that the typical end-user
today has one, single IP address; and now we will be issuing them 2^16
subnets; yet it is not too hard to imagine a future where the global
IPv6 address pool becomes constrained due to service-provider
inefficiency.

I would like to have innovations in SOHO devices, too; who knows what
these may be.  But I fear we may repeat the mistake that caused NAT to
be a necessity in IPv4 -- exhausting address space -- by foolishly
assuming that every household is going to need twenty-four orders of
magnitude more public addresses than it has today.

That is what these practices do -- they literally give end-users
twenty-four orders of magnitude more addresses, while it is easy to
imagine that we will come within one order of magnitude of running
completely out of IPv6 addresses for issuing to service providers.

I didn't know what the digit 1 followed by twenty-four zeroes was
called.  I had to look it up.  So our end-users will be receiving
about one septillion addresses to use in their home, but no one seems
to be asking what future technology we may be harming by possibly
constraining the global address pool.
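
The figure is easy to verify.  The sketch below assumes a /48 per
household, compared against the single address a typical end-user has
today.

```python
import math

# Addresses in one /48: the 80 bits to the right of the prefix.
addresses_in_48 = 2 ** (128 - 48)

# Orders of magnitude relative to a single address.
orders = math.log10(addresses_in_48)

print(f"{addresses_in_48:,} addresses, about 10^{orders:.1f}")
```

That is roughly 1.2e+24 addresses, a little over twenty-four orders
of magnitude more than one.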

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-10 Thread Jeff Wheeler

On Wed, Aug 10, 2011 at 8:40 PM, Mark Andrews ma...@isc.org wrote:
 No.  A typical user has 10 to 20 addresses NAT'd to one public address.

I'd say this is fair.  Amazingly enough, it all basically works right
with one IP address today.  It will certainly be nice to have the
option to give all these devices public IP addresses, or even have a
few public subnets; but it does require more imagination than any of
us have demonstrated to figure out how any end-user will need more
than 2^8 subnets.  That's still assuming that device-makers won't
decide they need to be able to operate with subnets of arbitrary size,
rather than fixed-size /64 subnets.

 There was a conscious decision made a decade and a half ago to go to
 128 bits instead of 64 bits and give each subnet 64 bits so we would
 never have to worry about the size of a subnet again.  IPv6 is about
 managing networks not managing addresses.

Thanks for the explanation of how to subnet IPv4 networks and use
RFC1918.  I hope most readers are already familiar with these
concepts.  You should note that IPv6 was not, in fact, originally
envisioned with /64 subnets; that figure was to be /80 or /96.  In the
mid-1990s, it was believed that dramatically increasing the number of
bits available for ISP routing flexibility was very beneficial, as
well as making access subnets so big that they should never need to
grow.  Then SLAAC came along.  Except SLAAC doesn't do necessary
things that DHCPv6 does, and the cost of implementing things like
DHCPv6 in very small, inexpensive devices has gone down dramatically.

I am amazed that so few imagine we might, within the lifetime of
IPv6, like to have more bits of address space for routing structure
within ISP networks; but these people do think that end-users need
1.2e+24 addresses for the devices they'll have in their home.

I don't have to use my imagination to think of ways that additional
bits on the network address side would have been advantageous -- all I
need is my memory.  In the 90s, it was suggested that a growing number
of dual-homed networks cluttering the DFZ could be handled more
efficiently by setting aside certain address space for customers who
dual-homed to pairs of the largest ISPs.  The customer routes would
then not need to be carried by anyone except those two ISPs, who are
earning money from the customer.  This never happened for a variety of
good reasons, but most of the technical reasons would have gone away
with the adoption of IPv6, as it was envisioned in the mid-90s.

There seems to be a lot of imagination being used for SOHO networks,
and none on the ISP side.  What a shame that is.

Owen, I do agree with the point you made off-list, that if huge
mistakes are made now and the IPv6 address space is consumed more
rapidly than the community is comfortable with, there should be plenty
of opportunity to fix that down the road.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-07 Thread Jeff Wheeler

On Sat, Aug 6, 2011 at 7:26 PM, Owen DeLong o...@delong.com wrote:
 Well, you aren't actually doing this on your network today.  If you
 practiced what you are preaching, you would not be carrying aggregate
 routes to your tunnel broker gateways across your whole backbone.

 Yes we would.

No, if you actually had a hierarchical addressing scheme, you would
issue tunnel broker customer /64s from the same aggregate prefix that
is used for their /48s.  You'd then summarize at your ABRs so the
entire POP need only inject one route for customer addressing into the
backbone.  Of course, this is not what you do today, and not because
of limited RIR allocation policies -- but because you simply did not
design your network with such route aggregation in mind.
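
The summarization argument can be sketched with Python's ipaddress
module.  All prefixes here are made up from the documentation range
2001:db8::/32 and are not anyone's real allocations, and
routes_into_backbone is a hypothetical helper that only counts which
prefixes an ABR would have to advertise.

```python
import ipaddress
from itertools import islice

# Hypothetical POP aggregate inside the IPv6 documentation prefix.
pop_aggregate = ipaddress.ip_network("2001:db8:4000::/37")

# Carve /48s from the aggregate; reserve the first for default /64s.
subnets_48 = pop_aggregate.subnets(new_prefix=48)
pool_for_64s = next(subnets_48)
customer_48s = list(islice(subnets_48, 3))
customer_64s = list(islice(pool_for_64s.subnets(new_prefix=64), 3))

# Alternative: /64s handed out from an unrelated /48 (also hypothetical).
separate_pool = ipaddress.ip_network("2001:db8:f000::/48")
separate_64s = list(islice(separate_pool.subnets(new_prefix=64), 3))


def routes_into_backbone(assignments, aggregates):
    """Prefixes the ABR advertises: the covering aggregate for each
    assignment, or the assignment itself if nothing covers it."""
    advertised = set()
    for net in assignments:
        cover = next((agg for agg in aggregates if net.subnet_of(agg)), net)
        advertised.add(cover)
    return advertised


routes_same = routes_into_backbone(customer_48s + customer_64s,
                                   [pop_aggregate])
routes_split = routes_into_backbone(customer_48s + separate_64s,
                                    [pop_aggregate, separate_pool])
print(len(routes_same))   # 1
print(len(routes_split))  # 2
```

When the /64s come out of the same aggregate as the /48s, the POP
injects a single route; a separate pool for the /64s costs at least
one more.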

 Those are artifacts of a small allocation (/32) from a prior RIR policy.
 The fact that those things haven't worked out so well for us was one of
 the motivations behind developing policy 2011-3.

There was nothing stopping you from using one /48 out of the /37s you
use to issue customer /48 networks for issuing the default /64 blocks
your tunnel broker hands out.

 We give a minimum /48 per customer with the small exception that
 customers who only want one subnet get a /64.

You assign a /64 by default.  Yes, customers can click a button and
get themselves a /48 instantly, but let's tell the truth when talking
about your current defaults -- customers are assigned a /64, not a
/48.

 We do have a hierarchical addressing plan. I said nothing about routing,
 but, we certainly could implement hierarchical routing if we arrived at a
 point where it was advantageous because we have designed for it.

How have you designed for it?  You already missed easy opportunities
to inject fewer routes into your backbone, simply by using different
aggregate prefixes for customer /64s vs /48s.

 However, requesting more than a /32 is perfectly reasonable. In
 the ARIN region, policy 2011-3.

 My read of that policy, and please correct me if I misunderstand, is
 that it recognizes only a two-level hierarchy.  This would mean that
 an ISP could use some bits to represent a geographic region, a POP, or
 an aggregation router / address pool, but it does not grant them
 justification to reserve bits for all these purposes.


 While that's theoretically true, the combination of 25% minfree,
 nibble boundaries, and equal sized allocations for all POPs based
 on your largest one allows for that in practical terms in most
 circumstances.

I don't think it does allow for that.  I think it requires you to have
at least one POP prefix 75% full before you can get any additional
space from the RIR, where 75% full means routed to customers, not
reserved for aggregation router pools.  This is not a hierarchy, it is
simply a scheme to permit ISPs to bank on having at least one level of
summarization in their addressing and routing scheme.

2011-3 does not provide for an additional level to summarize on the
aggregation routers themselves.  It should, but my read is that the
authors have a very different opinion about what hierarchical
addressing means than I do.  It should provide for route aggregation
on both the ABR and the aggregation router itself.

 ATT serves some entire states out of a single POP, as far as layer-3
 termination is concerned.


 Are any of the states with populations larger than Philadelphia among
 them?

Yes, for example, Indiana.  Pretty much every state in the former
Ameritech service territory.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-07 Thread Jeff Wheeler

On Sun, Aug 7, 2011 at 6:58 PM, Mark Andrews ma...@isc.org wrote:
 So you want HE to force all their clients to renumber.

No.  I am simply pointing out that Owen exaggerated when he stated
that he implements the following three practices together on his own
networks:
* hierarchical addressing
* nibble-aligned addressing
* /48 per access customer

You can simply read the last few messages in this thread to learn that
his recommendations on this list are not even practical for his
network today, because as Owen himself says, they are not yet able to
obtain additional RIR allocations.  HE certainly operates a useful,
high-profile tunnel-broker service which is IMO a very great asset to
the Internet at-large; but if you spend a few minutes looking at the
publicly available statistics on this service, they average only
around 10,000 active tunnels across all their tunnel termination boxes
combined.  They have not implemented the policies recommended by Owen
because, as he states, a /32 is not enough.

Do I think the position he advocates will cause the eventual
exhaustion of IPv6?  Well, let's do an exercise:

There has been some rather simplistic arithmetic posted today, 300m
new subnets per year, etc. with zero consideration of address/subnet
utilization efficiency within ISP networks and individual aggregation
router pools.  That is foolish.  We can all pull out a calculator and
figure that 2000::/3 has space for 35 trillion /48 networks.  That
isn't how they will be assigned or routed.
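
For reference, the calculator exercise looks like the sketch below.
It deliberately ignores utilization efficiency, which is the whole
objection; the 300m-per-year figure is the one quoted from the thread.

```python
GLOBAL_UNICAST_LEN = 3        # 2000::/3
ASSIGNMENT_LEN = 48           # /48 per end-user
SUBNETS_PER_YEAR = 300_000_000

total_48s = 2 ** (ASSIGNMENT_LEN - GLOBAL_UNICAST_LEN)
years_at_perfect_efficiency = total_48s // SUBNETS_PER_YEAR

print(f"{total_48s:,}")              # 35,184,372,088,832
print(years_at_perfect_efficiency)   # 117281
```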

The effect of 2011-3 is that an out-sized ISP like ATT has every
justification for deciding to allocate 24 bits worth of subnet ID for
their largest POP, say, one that happens to terminate layer-3
services for all customers in an entire state.  They then have policy
support for allocating the same sized subnet for every other POP, no
matter how small.  After all, the RIR policy permits them to obtain
additional allocations as soon as one POP subnet has become full.

So now you have a huge ISP with a few huge POPs, and a lot of small
ones, justified in assigning the same size aggregate prefix, suitable
for 2^24 subnets, to all those small POPs as well.  How many layer-3
POPs might this huge ISP have?  Any number.  It could be every central
office with some kind of layer-3 customer aggregation router.  It
could even be every road-side hut for FTTH services.  Perhaps they
will decide to address ten thousand POPs this way.

Now the nibble-aligned language in the policy permits them to round up
from 10,000 POPs to 16 bits worth of address space for POP ID.  So
ATT is quite justified in requesting:
48 (customer subnet length) - 24 (largest POP subnet ID size) - 16
(POP ID) == a /8 subnet for themselves.
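
The same subtraction, with the nibble rounding made explicit, can be
sketched as follows.  The 24-bit largest-POP size and the 10,000 POPs
are the figures from the example above; nibble_round_up is just an
illustrative helper.

```python
def nibble_round_up(bits):
    """Round a bit count up to the next multiple of 4."""
    return -(-bits // 4) * 4


CUSTOMER_LEN = 48        # /48 per customer
LARGEST_POP_BITS = 24    # subnet ID bits sized for the largest POP
POPS = 10_000            # number of POPs to be numbered

pop_id_bits_needed = (POPS - 1).bit_length()       # bits to count POPs
pop_id_bits = nibble_round_up(pop_id_bits_needed)  # nibble-aligned
allocation_len = CUSTOMER_LEN - LARGEST_POP_BITS - pop_id_bits

print(pop_id_bits_needed, pop_id_bits, allocation_len)  # 14 16 8
```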

Now you can see how this policy, and addressing scheme, is utterly
brain-dead.  It really does put you (and me, and everyone else) in
real danger of exhausting the IPv6 address space.  All it takes is a
few out-sized ISPs, with one large POP each and a bunch of smaller
ones, applying for the maximum amount of address space permitted them
under 2011-3.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-06 Thread Jeff Wheeler

On Sat, Aug 6, 2011 at 5:21 AM, Owen DeLong o...@delong.com wrote:
 At least don't make your life miserable by experimenting with too many 
 different assignment sizes,
 or advocate /64s or something, that's considered a design fault which will 
 come back to you some day.
 Read the RfCs and RIR policy discussions in the archives some years ago.

Note that in this thread, you advocate three things that are a little
tough to make work together:
* hierarchical addressing plan / routing
* nibble-aligned addressing plan
* minimum /48 per customer

If I were, for example, a hosting company with IPv6 terminated at the
layer-3 ToR switch, I would then use a /40 per rack of typical
dedicated servers.  If you then want some bits to be a POP-locator
field for your hierarchical routing scheme, you are already forced to
request more than a /32.  The number of customers per layer-3 device
for typical end-user access networks was around the same into the
late-1990s/early-2000s, as ISPs had racks of Portmasters or whatever
box of choice for terminating dial-up.

Densities have changed, but this doesn't necessarily win you an
advantage when combining those three properties.  This is especially
true if you consider that density may change in a difficult-to-predict
manner in the future -- a BRAS box with a couple thousand customers
today might have three times as many in a couple of years (IPv6 is
supposed to help us avoid renumbering or injecting additional routes
into our network, right?)  As an access provider, if I shared your
view, I would be reserving a /36 or /32 per BRAS box.  If I then want
some additional bits for hierarchical routing ... I'm going to need a
pretty large address block for perhaps a pretty small number of
customers.  After all, my scheme, applying your logic, dictates that I
should use a /32 or perhaps a /28 per each POP or city (I need to plan
for several BRAS each), even if I don't have a lot of customers today!
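
The prefix sizes above can be reproduced with a small sketch.  The
customer counts (40 servers per rack, 2,000 and 6,000 subscribers per
BRAS) and the one extra nibble for "several BRAS" per POP are my own
illustrative assumptions; nibble_aligned_pool is a hypothetical helper.

```python
def nibble_aligned_pool(customers, assignment_len=48):
    """Longest nibble-aligned prefix that holds one assignment of
    assignment_len per customer."""
    bits = max(customers - 1, 0).bit_length()   # bits to count customers
    bits = -(-bits // 4) * 4                    # round up to a nibble
    return assignment_len - bits


rack = nibble_aligned_pool(40)            # rack of dedicated servers
bras_today = nibble_aligned_pool(2_000)   # BRAS with ~2k customers
bras_later = nibble_aligned_pool(6_000)   # same BRAS, three times fuller

print(rack, bras_today, bras_later)       # 40 36 32
# One more nibble to number up to 16 BRAS per POP or city:
print(bras_today - 4, bras_later - 4)     # 32 28
```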

I think /56 is more sensible than /48, given the above, for most
end-users.  Either way, the users will be gaining a lot more
flexibility than they have with IPv4 today, where they probably get
just one IP address and have to pay a fee for any extras.  Giving the
typical end-user 8 fewer bits worth of address space allows the ISP
network more flexibility for hierarchical routing before they have to
go to their RIR and figure out how to justify an out-sized allocation.
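
A quick sketch of the trade-off, assuming a /32 allocation to the ISP
(an assumption for illustration only).  The "ISP bits" are everything
between the allocation and the assignment, shared by the customer
index and any routing hierarchy.

```python
ALLOCATION_LEN = 32   # assumed RIR allocation to the ISP
SUBNET_LEN = 64       # conventional subnet size


def tradeoff(assignment_len):
    """Return (/64 subnets per end-user, bits left to the ISP)."""
    user_subnets = 2 ** (SUBNET_LEN - assignment_len)
    isp_bits = assignment_len - ALLOCATION_LEN
    return user_subnets, isp_bits


for length in (48, 56):
    user_subnets, isp_bits = tradeoff(length)
    print(f"/{length}: {user_subnets} subnets per user, "
          f"{isp_bits} bits for the ISP")
```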

Also, if folks would stop thinking that every subnet should be a /64,
they will see that end-users, makers of set-top-gateways, or whatever,
can certainly address a whole lot of devices in a whole lot of subnets
even if the user is only given a /64.  Do we think DHCPv6 won't be the
most common way of assigning addresses on SOHO LANs, and that SLAAC
will be essential?  I, for one, think that some ISPs will be sick and
twisted enough to hand out /128s so they can continue charging for
more IP addresses; but certainly the makers of IPv6-enabled devices
will foresee that end-user LANs might not be /64 and include the
necessary functionality to work correctly with smaller subnets.

Before you beat me to it, yes, we seem to have completely opposing
views on this subject.  I will change my mind when I can go to the RIR
and get an IPv6 /24 for a small ISP with a few POPs and a few tens of
thousands of customers.  Should RIR policy permit that sort of thing?

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 end user addressing

2011-08-06 Thread Jeff Wheeler

On Sat, Aug 6, 2011 at 12:36 PM, Owen DeLong o...@delong.com wrote:
 On Aug 6, 2011, at 3:15 AM, Jeff Wheeler wrote:
 Note that in this thread, you advocate three things that are a little
 tough to make work together:
 * hierarchical addressing plan / routing
 * nibble-aligned addressing plan
 * minimum /48 per customer

 Hasn't been hard so far.

Well, you aren't actually doing this on your network today.  If you
practiced what you are preaching, you would not be carrying aggregate
routes to your tunnel broker gateways across your whole backbone.
Perhaps you also wouldn't use one allocation on the tunnel broker
gateway for /64s and another, a /37 in the case of Ashburn for
example, for users who self-provision a /48 from it.  Also, of course,
your default assignment plan is /64, not /48, even though there are
typically around, what, 10k tunnels active network-wide?

To be clear, you don't do any of the three things you advocate above
where Hurricane Electric's tunnel broker service is concerned.

 I think we were talking about access customers. I don't see giving /48s
 to individual dedicated servers as I don't see them having hierarchy
 behind them. I would think that a /48 per TOR switch would be
 reasonable in that case.

I wish there was more discussion about IPv6 addressing plans in
hosting environments on this list.  I think that, rarely, customers
will decide to tunnel from their home or office to their dedicated
server, co-lo rack, etc.  My addressing policies provide for this
type of customer to receive a /56 upon request without breaking my
hierarchical addressing scheme.  If they need more than that, the
layer-3 aggregator has to inject an additional route into the POP
area.  If a whole bunch of customers on one aggregator ask for /56,
then the aggregator needs an extra /48 (which might really mean
growing its existing /48 to a shorter route.)

 However, requesting more than a /32 is perfectly reasonable. In
 the ARIN region, policy 2011-3.

My read of that policy, and please correct me if I misunderstand, is
that it recognizes only a two-level hierarchy.  This would mean that
an ISP could use some bits to represent a geographic region, a POP, or
an aggregation router / address pool, but it does not grant them
justification to reserve bits for all these purposes.

 density, even in 20 years. I realize that customer density in
 urban areas does tend to increase, but, assuming a maximum
 50% market penetration, serving a city the size of Philadelphia
 out of a single POP still seems unlikely to me.

ATT serves some entire states out of a single POP, as far as layer-3
termination is concerned.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: [lisp] Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-18 Thread Jeff Wheeler

On Mon, Jul 18, 2011 at 12:15 PM, Noel Chiappa j...@mercury.lcs.mit.edu wrote:
 Let me make sure I understand your point here. You don't seem to be
 disagreeing with the assertion that for most sites (even things like very
 large universities, etc), their 'working set' (of nodes they communicate
 with) will be much smaller than the network as a whole?

Why would you assume this to be true if LISP also promises to make
multi-homing end-sites cheaper and easier, and independent of the
ISP's willingness to provide BGP without extra cost?  You see, if
every SOHO network and power user can suddenly become multi-homed
without spending a great deal of money on a powerful router and ISP
services which support BGP, many of these networks will do so.

The working sets of a scaled-up, LISP future will make the BGP DFZ of
today look small.

 So only the very largest content providers (YouTube, etc) will have
 'working sets' which include a fairly large share of the entire Internet?

No, any end-site of interest to a DoS attacker must be able to deal
with a working set which includes the entire Internet.  The reason for
this is obvious: it will be the best way to attack a LISP
infrastructure, and it will not be difficult for attackers to send
packets where each packet's source address appears to be from a
different mapping system entry.
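
The effect can be illustrated with a toy model.  This is not the LISP
map-cache or any real implementation: it is a plain LRU cache, and the
cache size, working-set size, mapping-system size and attack ratio are
all made-up parameters.

```python
import random
from collections import OrderedDict


class MapCache:
    """Toy LRU cache of mapping entries (not the real LISP map-cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, key):
        """Return True on a hit; on a miss, install the entry and evict
        the least recently used one if the cache is over capacity."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return True
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return False


def legit_hit_rate(attack_fraction, packets=50_000, capacity=1_000,
                   working_set=500, mapping_entries=1_000_000, seed=1):
    """Hit rate seen by legitimate packets only."""
    rng = random.Random(seed)
    cache = MapCache(capacity)
    hits = total = 0
    for _ in range(packets):
        if rng.random() < attack_fraction:
            # Attack packet: appears to come from a random mapping entry.
            cache.lookup(rng.randrange(working_set, mapping_entries))
        else:
            total += 1
            hits += cache.lookup(rng.randrange(working_set))
    return hits / total


baseline = legit_hit_rate(attack_fraction=0.0)
attacked = legit_hit_rate(attack_fraction=0.9)
print(f"no attack: {baseline:.2%}, under attack: {attacked:.2%}")
```

In this model the cache is twice the size of the legitimate working
set, so without an attack nearly every lookup hits.  Once most packets
reference random entries, the legitimate entries are evicted before
they are reused and the hit rate for real traffic collapses.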

Some people have commented that LISP hopes to prevent source address
spoofing through technical means that have not been fully explored.
This is a good goal but it must require the ETR doing address
validation to look-up state from the mapping system.  It will have the
same cache churn problem as an ITR subject to a reflection attack (or
an outbound DoS flow meant to disable that ITR.)

So there is no practical means of doing source address validation on
ETRs (under DoS.)  Even if you did that, the ITR must still be subject
to the occasional large flow of outbound traffic from a compromised
host (dorm machine, open wireless, hacked server, etc.) which is
intended to disable the ITR.

 I have previously commented that such sites have lots of specialized
 infrastructure to handle their traffic loads - do you think it will be
 infeasible for them to have specialized LISP infrastructure too? (Leaving
 aside for a moment what that infrastructure would look like - it's not
 necessarily separate hardware, it might be integrated into existing boxes
 on the periphery of their site.)

Again, every content shop will need to have that specialized
infrastructure.  Every site that someone might have a motive to launch
a DoS attack against must be able to withstand at least trivial DoS.
If you think only the super-huge sites will have a large working set,
you are again ignoring DoS attacks.

The same is true of ISP subscriber access platforms.  If my ISP's BRAS
effectively goes down regularly, I won't keep that ISP service very
long, I'll change to a competitor.  The more subscribers on one BRAS,
the more likely it will receive frequent DoS attacks.

So in reality, the common cache size needed to achieve a high hit rate
really does not matter, unless you wish to ignore DoS (which you seem
to want to do very badly.)

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

_
NANOG mailing list
NANOG@nanog.org
https://mailman.nanog.org/mailman/listinfo/nanog

Re: NDP DoS attack (was Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?))

2011-07-17 Thread Jeff Wheeler

On Sun, Jul 17, 2011 at 11:42 AM, William Herrin b...@herrin.us wrote:
 My off-the-cuff naive solution to this problem would be to discard the
 oldest incomplete solicitation to fit the new one and, upon receiving
 an apparently unsolicited response to a discarded solicitation,
 restart the process flagging that particular query non-discardable.

Do you mean to write, flagging that ND entry non-discardable?  Once
the ND entry is in place, it should not be purged for quite some time
(configurable is a plus), on the order of minutes or hours.  Making
them permanent would, however, cause the ND table to eventually
become full when foolish things like frequent source address changes
for privacy are in use, many clients are churning in and out of the
LAN, etc.

 Where does this naive approach break down?

It breaks down because the control-plane can't handle the relatively
small number of punts which must be generated in order to send ND
solicits, and without the ability to install incomplete entries into
the data-plane, those punts cannot be policed without, by design,
discarding some good punts along with the bad punts resulting from
DoS traffic.
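
A back-of-the-envelope sketch of why policing does not help, assuming
a policer that cannot tell good punts from bad and so drops both in
proportion to what is offered.  The rates are hypothetical, not
measurements from any platform.

```python
def legit_punts_served(policer_pps, legit_pps, attack_pps):
    """Legitimate punts per second that survive a policer which cannot
    tell good punts from bad and drops both in proportion."""
    offered = legit_pps + attack_pps
    if offered <= policer_pps:
        return float(legit_pps)
    return legit_pps * policer_pps / offered


# Hypothetical rates: 1,000 pps policer, 50 pps of real new neighbors.
quiet = legit_punts_served(1_000, 50, 0)
flooded = legit_punts_served(1_000, 50, 100_000)
print(quiet, round(flooded, 2))  # 50.0 0.5
```

With these numbers, roughly one legitimate solicitation in a hundred
still reaches the control-plane during the flood.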

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-17 Thread Jeff Wheeler

On Sun, Jul 17, 2011 at 11:07 AM, Eliot Lear l...@cisco.com wrote:
 We all make mistakes in not questioning our own positions, from time to
 time.  You, Jeff, seem to be making that very same mistake.

 Rome wasn't built in a day.  The current system didn't come ready-made
 pre-built with all the bells and whistles you are used to.  It grew slowly
 over time, as we learned what works, what doesn't, and what was missing.
 Any system that attempts to deal with locator/id separation will assuredly
 not be built in a day, either.

LISP work has been going on for a long time to still not have any
useful discussion on a designed-in, trivial DoS which will affect any
ITR and make the work being done to allow ETRs to validate source
addresses (or even do loose uRPF) into a DoS vector for ETRs as well.

 While you have stated a problem relating to a security consideration –
 specifically that there is a potential reflection attack that could cause
 cache thrashing, the solution may not be what you expect.

I agree, a solution might be available.  One has not been presented
yet.  In my earliest postings to the IETF LISP list, the ones which
received zero replies, I suggest a way to significantly improve the
cache churn DoS problem.  It is not novel, as Darrel Lewis informed
me, which means that even already-available research has not been
applied to LISP in this area, and the Mapping Service protocol ties
the hands of implementors so they *cannot* apply such techniques while
still conforming to the specifications.

 Yes, you were asked.  Even so... Novelty isn't something worth arguing over,
 except in patent battles.

Really?  Novelty, by definition, advances the state of the art.  You
may not think it's very important to inform people that LISP is based
on essentially the same flow-caching scheme used in the 1990s, but I
do.

 Never is a very long time.  Many uses of never have been used relating to
 the Internet.  It is the corollary to Imminent Death of the 'Net: film @
 11.  I still have the NANOG tee-shirt with Robert Metcalfe, someone with
 considerably more notoriety, eating his hat.

And yet, I am quite comfortable with the statement that LISP can never
scale up to meet the demands of the Internet.  Perhaps with
fundamental changes to its design, and its advocates giving up some of
their current assumptions, some progress could be made.  In its
current form, though, LISP will never be a useful tool to scale the
Internet, and in fact, it cannot meet the demands of today's Internet.
Unless, of course, you pretend that the ability to DoS any router
with a trivial amount of traffic is not worthy of concern.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: NDP DoS attack (was Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?))

2011-07-17 Thread Jeff Wheeler

On Sun, Jul 17, 2011 at 3:40 PM, Owen DeLong o...@delong.com wrote:
 Basically an ND entry would have the following states and timers:

I've discussed what you have described with some colleagues in the
past.  The idea has merit and I would certainly not complain if
vendors included it (as a knob) on their boxes.  The downfalls of this
approach are that they still don't ensure the discovery of new
neighbors (rather than previously seen neighbors) during DoS, and you make
the local DoS a bit more complex by needing to establish more rules
for purging these semi-permanent entries.

 I think most of this punting could be handled at the line card level. Is there
 any reason that the ND process can't be moved into line-card level silicon
 as described above?

You could implement ND solicit in the data-plane (and remove punts
entirely) in even some current chips, to say nothing of future ones.
Whether or not that is a good idea, well, keep in mind that the ND
solicits would then be mcasted to the LAN at a potentially unlimited
rate.

That is not necessarily a problem unless the L2 implementation is not
too good with respect to multicast.  For example, in some switches
(mostly those that are routers that can switch) the L2 mcast has
surprising caveats, such as using up a lot of fabric capacity for
whatever replication scheme has been chosen.

Of course, you also hope NDP on all the connected hosts works right.
I believe some Juniper customers noticed a pretty big problem with
JUNOS NDP implementation when deploying boxes using the DE-CIX
addressing scheme, and in a situation like that, the ingress router
for the attack could be crippled by spurious responses from the other
mis-behaving hosts on the LAN, essentially like smurf except without
sending any garbage back out to the Internet.

What you definitely don't want to do is assume this fixes the local
DoS, because it doesn't.  I would like for you to keep in mind that a
host on the LAN, misconfigured to do something like local proxy-arp,
or otherwise responding to all ND solicits, would accidentally DoS the
LAN's gateway.  I do not think we should assume that the local DoS
won't happen, or is fixable with a whack-a-mole method.

 Sure, that doesn't solve the problem on current hardware, but, it moves it
 from design problem to implementation issue, which IMHO is a step in the
 right direction.

Well, it already is a design problem that implementations can largely
work-around.  Vendors just aren't doing it.  :-/

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: in defense of lisp (was: Anybody can participate in the IETF)

2011-07-13 Thread Jeff Wheeler

On Wed, Jul 13, 2011 at 2:27 AM, Randy Bush ra...@psg.com wrote:
 I fear that at its worst and most successful, LISP ensures ipv4 is the
 backbone transport media to the detriment of ipv6 and at its best, it
 is a distraction for folks that need to be making ipv6 work, for real.

 i suspect that a number of lisp proponents are of that mind.  i do not
 think it does a service to the internet.

My understanding is that transport over v6 is indeed on everyone's
mind and absolutely is a goal for all the LISP people.  So on this
particular point, your concern is being addressed.

What LISP has not done is actually improve the root problem of scaling
up the number of multi-homed networks or locators.  The cache scheme
works if you imagine an ideal Internet where there is no DoS, but
otherwise, it does not work.  All the same problems of flow-cache
routing still exist and LISP actually makes them worse in some cases,
not better.  It also adds huge complexity and risk but what value it
adds (outside of VPN-over-Internet) is questionable at best.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-13 Thread Jeff Wheeler

Luigi, you have mis-understood quite a bit of the content of my
message.  I'm not sure if this is of any further interest to NANOG
readers, but as it is basically what seems to go on a lot, from my
observations of IETF list activity, I'll copy my reply to the list as
you have done.

On Wed, Jul 13, 2011 at 4:08 AM, Luigi Iannone
lu...@net.t-labs.tu-berlin.de wrote:
 Granted. You are the real world expert. Now can you stop repeating this in
 each email and move on?

No.  This is a point that needs to be not only made, but driven home.
You do not understand how routers work, which is why you are having
such difficulty understanding the severity of this problem.  The
lisp-threats work you have done is basically all control-plane /
signalling issues, and no data-plane issues.  This is not a
coincidence; it is because your knowledge of the control-plane side is
good and of the data-plane is weak.

 This is completely false. Several people gave credit to you about the
 existence of the threat you pointed out.

Really?  In April, when I posted a serious problem, and received no
replies?  Now, the original folks who I discussed this with, before
ever posting to the IETF LISP list, are finally seeking clarification,
because apparently there may have been some confusion in April,
possibly leading to their total dismissal of this as a practical
concern.

 This is again false. We had mail exchange both privately and on the
 mailinglist. We proposed to you text to be added to the threats draft but
 you did not like it. We are asking to propose text but we have no answer
 from you on this point.

Actually, you classified this as an implementation concern, which is
false.  You have said yourself that this is why you believe it
deserves just one sentence, if that, in the lisp-threats draft.  This
is not an implementation-specific concern, it is a design flaw in the
MS negative response scheme, which emerges to produce a trivial DoS
threat if LISP ever scales up.

 Now there is a LISP threats draft which the working group mandates
 they produce, discussing various security problems.  The current paper
 is a laundry list of what if scenarios, like, what if a malicious
 person could fill the LISP control-plane with garbage.  BGP has the

 So you are saying that BGP can be victim of similar attacks/problem
 still... if you are reading this email it means that the Internet is still
 running...

This is where I believe you are mis-reading my message.  Your threats
draft covers legitimate concerns which also exist in the current
system that is widely deployed, which is largely, BGP plus big FIB.
What you don't cover, at all, is an IMO critical new threat that
emerges in the data-plane from the design of the MS protocol.

 If you still think that LISP is using a flow-cache you should have a second
 read to the set of drafts.

This language may appear unclear if you haven't read it in the context
of my other postings.  LISP routing most certainly is a flow-cache,
however, the definition of flow is different.  Some platforms and
routing schemes see a flow as a layer-3 destination /32 or similar
(some 90s routers), others more granular (firewalls, where flows are
usually layer-4 and often stateful), and with LISP, the flow is the
address space routed from your ITR to a remote ETR, which may cover a
large amount of address space and many smaller flows.

The LISP drafts also refer to these flows as tunnels, but that
language could easily be confused to mean much more permanent, static
tunnels, or MPLS-like tunnels which are signaled throughout the
network of P routers.  So there are clear semantic issues of
importance when talking about LISP, and all these terms must be read
in the correct context.

 For the third time: this is false. We got the problem, we were asking for
 more specific information in order to quantify the risk. We asked you help

You haven't got it, or you would already understand the risk very
well.  It is not my intention to fault you and your colleagues for
failing to understand this; but to demonstrate clearly that the right
kind of expertise is absolutely not being applied to LISP, and there
is a huge and possibly intractable threat that was completely
overlooked when producing what is meant to be an authoritative
document on currently-known threats to LISP.

 to state the problem and explained to you where the solution should be
 addressed. But you seem to be stuck on the operator vs. researcher
 discussion, which IMHO is just pointless.

Substantially all operators are stuck there.  They should participate more.

 Let me now ask a simple question: why are you so strongly against LISP?

No new work has been done to address the problem of scaling up the
number of locators or multi-homed end-sites.  However, the *claims*
being made by LISP advocates are that the caching scheme you have,
which is not novel, does solve this problem.  It does not.  It cannot
as there has been no novel work on this.

It is very

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-12 Thread Jeff Wheeler

On Tue, Jul 12, 2011 at 11:42 AM, Leo Bicknell bickn...@ufp.org wrote:
 I'll pick on LISP as an example, since many operators are at least
 aware of it.  Some operators have said we need a locator and identifier
 split.  Interesting feedback.  The IETF has gone off and started
 playing in the sandbox, trying to figure out how to make that go.

As an operator (who understands how most things work in very great
detail), I found the LISP folks very much uninterested in my concerns
about if LISP can ever be made to scale up to Internet-scale, with
respect to a specific DDoS vector.  I also think that an explosion of
small, multi-homed SOHO networks would be a disaster, because we might
have 3 million FIB instead of 360k FIB after a few years.  These
things are directly related to each-other, too.

So I emailed some LISP gurus off-list and discussed my concern.  I was
encouraged to post to the LISP IETF list, which I did.  To my great
surprise, not one single person was interested in my problem.  If you
think it is a small problem, well, you should try going back to
late-1990s flow-cache routing in your data-center networks and see
what happens when you get DDoS.  I am sure most of us remember some of
those painful experiences.
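
For anyone who did not live through it, the failure mode can be reproduced in a few lines.  This is only a rough sketch of a generic per-destination LRU flow-cache, not the LISP map-cache or any vendor's implementation; the class name, the capacity and the traffic mix are all made up for illustration.

```python
from collections import OrderedDict
from itertools import count


class FlowCache:
    """Toy LRU flow-cache: a miss stands in for a slow-path punt plus install."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, dest) -> bool:
        if dest in self.entries:
            self.entries.move_to_end(dest)    # refresh on hit
            return True
        self.entries[dest] = True             # miss: install a new entry
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return False


def legit_hit_rate(capacity: int, legit_dests: int, packets: int,
                   attack_per_legit: int) -> float:
    """Hit rate seen by legitimate traffic cycling over legit_dests destinations."""
    cache = FlowCache(capacity)
    attack_dest = count()   # every attack packet uses a never-seen destination
    hits = 0
    for i in range(packets):
        for _ in range(attack_per_legit):
            cache.lookup(("attack", next(attack_dest)))
        if cache.lookup(("legit", i % legit_dests)):
            hits += 1
    return hits / packets


if __name__ == "__main__":
    print("no attack:  ", legit_hit_rate(1000, 100, 10_000, 0))
    print("with attack:", legit_hit_rate(1000, 100, 10_000, 20))
```

With these assumed numbers the legitimate destinations fit in the cache ten times over, yet a modest stream of packets to unique destinations evicts every one of them before it is used again.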

Now there is a LISP threats draft which the working group mandates
they produce, discussing various security problems.  The current paper
is a laundry list of what if scenarios, like, what if a malicious
person could fill the LISP control-plane with garbage.  BGP has the
same issue, if some bad guy had enable on a big enough network that
their peers/transits don't filter their routes, they could do a lot of
damage before they were stopped.  This sometimes happens even by
accident, for example, some poor guy accidentally announcing 12/9 and
giving ATT a really bad day.

What it doesn't contain is anything relevant to the special-case DDoS
that all LISP sites would be vulnerable to, due to the IMO bad
flow-cache management system that is specified.  I am having a very
great deal of trouble getting the authors of the threats document to
even understand what the problem is, because as one of them put it, he
is just a researcher.  I am sure he and his colleagues are very
smart guys, but they clearly do not remember our 1990s pains.

That is the "not an operator" problem.  It is understandable.

Others who have been around long enough simply dismiss this problem,
because they believe the unparalleled benefits of LISP for mobility
and multi-homing SOHO sites must greatly out-weigh the fact that,
well, if you are a content provider and you receive a DDoS, your site
will be down and there isn't a damn thing you can do about it, other
than spec routers that have way, way more FIB than the number of
possible routes, again due to the bad caching scheme.

The above is what I think is the ego-invested problem, where certain
pretty smart, well-intentioned people have a lot of time, and
professional credibility, invested in making LISP work.  I'm sure it
isn't pleasing for these guys to defend their project against my
argument that it may never be able to reach Internet-scale, and that
they have missed what I claim is a show-stopping problem with an easy
way to improve it through several years of development.  Especially
since I am a guy who did not ever participate in the IETF before,
someone they don't know from a random guy on the street.

I am glad that this NANOG discussion has got some of these LISP folks
to pay more attention to my argument, and my suggested improvement (I
am not only bashing their project; I have positive input, too.)
Simply posting to their mailing list once and emailing a few draft
authors did not cause any movement at all.  Evidently it does get
attention, though, to jump up and down on a different list.  Go
figure!

If operators don't provide input and *perspective* to things like
LISP, we will end up with bad results.

How many of us are amazed that we still do not have 32:32 bits BGP
communities to go along with 32 bit ASNs, for signalling requests to
transit providers without collision with other networks' community
schemes?  It is a pretty stupid situation, and yet here we are, with
32 bit ASN for years, and if you want to do advertisement control with
32 bit ASNs used, you are either mapping your 32 bit neighbors to
special numbers, or your community scheme can overlap with others.
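
The collision is plain arithmetic.  A standard community (RFC 1997) is one 32 bit value, conventionally split into a 16 bit ASN and a 16 bit value, so a 32 bit ASN has nowhere to go.  A minimal sketch, in which the function names and the 32:32 layout are illustrative assumptions and not any standardized encoding:

```python
import struct


def encode_standard_community(asn: int, value: int) -> bytes:
    """Pack a community as two 16 bit fields, ASN:value, in network byte order."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError(f"ASN {asn} does not fit in the 16 bit ASN half")
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"value {value} does not fit in 16 bits")
    return struct.pack("!HH", asn, value)


def encode_wide_community(asn: int, value: int) -> bytes:
    """Hypothetical 32:32 layout of the kind asked for above."""
    if not 0 <= asn <= 0xFFFFFFFF or not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("both fields must fit in 32 bits")
    return struct.pack("!II", asn, value)


if __name__ == "__main__":
    print(encode_standard_community(3257, 100).hex())
    try:
        encode_standard_community(196608, 100)   # a 32 bit ASN
    except ValueError as err:
        print("rejected:", err)
    print(encode_wide_community(196608, 100).hex())
```

Until something like the second function exists on the wire, a network with a 32 bit ASN has to borrow a 16 bit number for the left half, which is exactly where the overlap with other networks' schemes comes from.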

That BGP community problem is pretty tiny compared to, what if people
really started rolling out something new and clever like LISP, but in
a half-baked, broken way that takes us back to the 1990s era of small DDoS
taking out a whole data-center aggregation router.  A lot of us think
IPv6 is over-baked and broken, and probably this is why it has taken
such a very long time to get anywhere with it.  But ultimately, it is
our fault for not participating.  I am reversing my own behavior and
providing input to some WGs I care about, in what time I have to do
so.  More operators should do the same.

Re: Why is IPv6 broken?

2011-07-11 Thread Jeff Wheeler

On Mon, Jul 11, 2011 at 3:25 AM, Tom Hill t...@ninjabadger.net wrote:
 On Sun, 2011-07-10 at 10:14 -0400, Jeff Wheeler wrote:
 Cogent's policy of requiring a new contract, and from what I am still
 being told by some European customers, new money, from customers in
 exchange for provisioning IPv6 on existing circuits, means a simple
 technical project gets caught up in the complexities of budgeting and
 contract execution.

 Can we have IPv6 transit?
 Yes, please turn up a session to..

 That was asking Cogent for IPv6 dual-stack on our existing IPv4
 transit.

I continue to hear different.  In my first-hand experience just about
three weeks ago, I was told by Cogent that I need to execute a new
contract to get IPv6 added to an existing IPv4 circuit (U.S.
customer.)  This turned a simple pilot project with only a few I.T.
folks involved into, well, I'm still waiting on this new contract to
be executed.  I'm not surprised.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-11 Thread Jeff Wheeler

On Mon, Jul 11, 2011 at 3:18 PM, William Herrin b...@herrin.us wrote:
 On the other hand, calling out ops issues in RFCs is a modest reform
 that at worst shouldn't hurt anything. That beats my next best idea:

I think if this were done, some guy like me would spend endless hours
arguing with others about what should and should not be documented in
this proposed section, without it actually benefiting the process or
improving the underlying protocol function / specification.  Let
me give you an example:

BGP Messages, which are up to 4KB, need to be expanded to support
future features like as-path signing.  Randy Bush proposes to extend
them to 65,535 octets, the maximum size without significantly changing
the message header.  This raises a few concerns which I label as
operational, for example, off-by-one bugs in code can fail to be
detected by a neighboring BGP speaker in some circumstances, because
an age-old (since BGP 1) idiot check in the protocol is being silently
removed.
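
To make the concern concrete, here is a rough sketch of the fixed-header sanity check a BGP speaker applies under RFC 4271 (16-octet marker, 2-octet length, 1-octet type; length between 19 and 4096).  That the upper bound on the length field is the check being lost is my reading of the example; the function and constant names are made up for illustration.

```python
import struct

BGP_HEADER_LEN = 19            # 16-octet marker + 2-octet length + 1-octet type
BGP_MAX_LEN_CLASSIC = 4096     # upper bound in RFC 4271
BGP_MAX_LEN_EXTENDED = 65535   # largest value a 16 bit length field can hold


def check_header(header: bytes, max_len: int = BGP_MAX_LEN_CLASSIC) -> int:
    """Validate the fixed header and return the claimed message length."""
    if len(header) < BGP_HEADER_LEN:
        raise ValueError("short header")
    marker, length, _msg_type = struct.unpack("!16sHB", header[:BGP_HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("bad marker")
    if not BGP_HEADER_LEN <= length <= max_len:
        raise ValueError(f"bad message length {length}")
    return length


if __name__ == "__main__":
    # A header whose length field is wrong, e.g. from a framing bug upstream.
    bogus = b"\xff" * 16 + struct.pack("!HB", 5000, 2)
    try:
        check_header(bogus)
    except ValueError as err:
        print("classic limit catches it:", err)
    print("extended limit accepts:", check_header(bogus, BGP_MAX_LEN_EXTENDED))
```

Once the limit equals the largest value the field can express, the upper-bound test can never fail, so a neighbor with a framing bug is no longer caught at this step.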

If you ask me, that is operational and belongs in such a section.  I'm
sure others will disagree.  So we would have a bunch of arguing over
whether or not to call this out specifically.

Another person believes that expanding the message will affect some
vendors' custom TCP stacks, due to window size considerations.  I
might think that is a developer problem and the affected vendors
should fix their crappy TCP implementations, but it might produce
unusual stalling problems, etc. which operators have to troubleshoot.
Is that an operational issue?  Should it be documented?

There can be many operational concerns when creating or modifying a
protocol specification, and every person won't agree on what belongs
and what doesn't.  However, I do not think the requirement to document
them will improve the process or the protocols.  It will only add
work.

Besides, you want IETF people who are claimed not to understand
operational problems to figure them out and document them in the RFCs?
I do not think this will be helpful.  More hands-on operators
participating in their process is what is needed.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-11 Thread Jeff Wheeler

On Mon, Jul 11, 2011 at 3:35 PM, Leo Bicknell bickn...@ufp.org wrote:
 The IETF does not want operators in many steps of the process.  If
 you try to bring up operational concerns in early protocol development
 for example you'll often get a we'll look at that later response,
 which in many cases is right.  Sometimes you just have to play with
 something before you worry about the operational details.  It also

I really don't understand why that is right / good.  People get
personally invested in their project / spec, and not only that, vendor
people get their company's time and money invested in
proof-of-concept.  The longer something goes on with what may be
serious design flaws, the harder it is to get them fixed, simply
because of momentum.

Wouldn't it be nice if we could change the way that next-header works
in IPv6 now?  Or get rid of SLAAC and erase the RFCs recommending /80
and /64 from history?

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-11 Thread Jeff Wheeler

On Mon, Jul 11, 2011 at 5:12 PM, Owen DeLong o...@delong.com wrote:
 No... I like SLAAC and find it useful in a number of places. What's wrong
 with /64? Yes, we need better DOS protection in switches and routers

See my slides http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf for
why no vendor's implementation is effective DOS protection today and
how much complexity is involved in doing it correctly, which requires
not only knobs on routers, but also on layer-2 access switches, which
is not easy to implement.  It's a whole lot smarter to just configure
a smaller network when that is practical.  In fact, that advice should
be the standard.

I really don't understand why we need SLAAC.  I believe it is a relic
of a mindset when a DHCP client might have been hard to implement
cost-effectively in a really light-weight client device (coffee pot?
wrist-watch?)  Or when running a DHCP server was some big undertaking
that couldn't be made not only obvious, but transparent, to SOHO users
buying any $99 CPE.

I do understand why SLAAC needs /64.  Okay, so configure /64 on those
networks where SLAAC is utilized.  Otherwise, do something else.
Pretty simple!  Again, please see my slides.

 to accommodate some of the realities of those decisions, but, that's not
 to say that SLAAC or /64s are bad. They're fine ideas with proper
 protections.

The proper protections are kinda hard to do if you have relatively
dumb layer-2 access switches.  It is a lot harder than RA Guard, and
we aren't ever likely to see that feature on a large base of installed
legacy switches, like Cisco 2950.  Replacing those will be
expensive.  We can't replace them yet anyway because similar switches
(price) today still do not have RA Guard, let alone any knobs to
defend against neighbor table churn, etc.  I'm not sure if they ever
will have the latter.

 I'm not sure about the /80 reference as I haven't encountered that
 recommendation outside of some perverse ideas about point-to-point
 links.

This is because you didn't follow IPv6 progress until somewhat
recently, and you are not aware that the original suggestion for
prefix length was 80 bits, leaving just 48 bits for the host portion
of the address.  This was later revised.  It helps to know a bit of
the history that got us to where we are now.

It was originally hoped, by some, that we may not even need NDP
because the layer-2 adjacency would always be encoded in the end of
the layer-3 address.  Some people still think vendors may get us to
that point with configuration knobs.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-11 Thread Jeff Wheeler

On Mon, Jul 11, 2011 at 7:48 PM, Jimmy Hess mysi...@gmail.com wrote:
 If every vendor's implementation is vulnerable to a NDP Exhaustion
 vulnerability,
 how come the behavior of specific routers has not been documented 
 specifically?

Well, I am in the business of knowing the behavior of kit being
considered by my clients for their applications.  Every box breaks
when tested, period.  I imagine you have tested zero, thus you have no
data of your own to go on.  No vendors are rushing to spend money on
independent testing laboratories to produce reports about this,
because they pretty much all know their boxes will break (or are not
even aware of the potential problem, in the case of a few scary
vendors.)

 If  zero devices are not vulnerable, you came to this conclusion
 because you tested
 every single implementation against IPv6 NDP DoS,  or?

Although I have tested many routers to verify my thinking, if you
actually read the slides and understand how routers work, you too will
know that every router is vulnerable.  If you don't know, you don't
understand how routers work.  It's that simple.

 How come there are no security advisories.
 What's the CWE or CVE number for this vulnerability?

Again, no one is interested in this problem yet because vendors really
don't want their customers to demand more knobs.  Cisco is the only
vendor who has done anything at all.  If you read about their knob,
you immediately realize that it is a knob to control the failure mode
of the box, not to fix anything.  Why?  It can't be fixed without
not using /64 (or similar) or going to the extreme lengths I outline
in those slides.

 It would be useful to at least have the risk properly described, in
 terms of what
 kind of DoS condition could arise on specific implementations.

Let's take 6500/SUP720 for example.  On this platform, a policer is
shared between the need to resolve ARP entries and ND table entries.
If you attack a dual-stack SUP720 box it will break not only IPv6
neighbor resolution, but also IPv4 neighbor resolution.  This is
pretty much the worst-case scenario because not only will your IPv6
break, which may annoy customers but not be a disaster; it will also
break mission-critical IPv4.  That's bad.  Routing-protocol
adjacencies can be affected, disabling not just some hosts downstream
of the box, but also its upstream connectivity.  It doesn't get any
worse than that.

You are right to question my statements.  I'm not an independent lab
doing professional tests and showing the environment and conditions of
how you can reproduce the results.  I'm just a guy helping my clients
decide what kit to buy, and how they should configure their networks.
The only reason I have bothered to produce slides is because we are at
a point where we have end-customers questioning our reluctance to
provision /64 networks for mixed-use data-center LANs, and until
vendors actually do something to address this, or the standard
changes, I need to increase awareness of this problem so I am not
forced to deploy a broken design on my own networks the way a lot of
other clueless people are.

Again, this is only hard to understand (or accept) if you don't know
how your routers work.
* why do you think there is an ARP and ND table?
* why do you think there are policers to protect the CPU from
excessive ARP/ND punts or traffic?
* do you even know the limit of your boxes' ARP / ND tables?  Do you
realize that limit is a tiny fraction of one /64?
* do you understand what happens when your ARP/ND policers are reached?
* did you think about the impact on neighboring routers and protocol
next-hops, not just servers?
* did you ever try to deploy a /16 on a flat LAN with a lot of hosts
and see what happens?  Doesn't work too well.  A v6 /64 is 281
trillion times bigger than a v4 /16.  There's no big leap of logic
here as to why one rogue machine could break your LAN.
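
The arithmetic behind the last bullet is easy to check.  In this quick sketch the table size and packet rate are hypothetical round numbers, not measurements from any particular box:

```python
V4_SLASH16 = 2 ** (32 - 16)     # addresses in an IPv4 /16
V6_SLASH64 = 2 ** (128 - 64)    # addresses in an IPv6 /64

ratio = V6_SLASH64 // V4_SLASH16   # 2**48, roughly 281 trillion

# Hypothetical figures, chosen only for illustration.
ASSUMED_ND_TABLE_ENTRIES = 100_000
ASSUMED_ATTACK_PPS = 10_000


def table_fraction_of_subnet(entries: int, subnet_size: int) -> float:
    """Share of the subnet that the neighbor table could ever hold."""
    return entries / subnet_size


def seconds_to_fill_table(entries: int, pps: int) -> float:
    """Time for a scan that hits one new address per packet to create
    `entries` unresolved neighbor entries, ignoring timeouts and policers."""
    return entries / pps


if __name__ == "__main__":
    print("ratio /64 : /16 =", ratio)
    print("table covers",
          table_fraction_of_subnet(ASSUMED_ND_TABLE_ENTRIES, V6_SLASH64),
          "of one /64")
    print("seconds to fill:",
          seconds_to_fill_table(ASSUMED_ND_TABLE_ENTRIES, ASSUMED_ATTACK_PPS))
```

Whatever the real table size on your platform is, the point stands: it is a vanishingly small fraction of one /64, and a scan needs very little time or bandwidth to fill it.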

There is no router which is not vulnerable to this.  If you don't
believe me, read the Cisco documentation on their knob limiting ND
entries per interface, after which there may be service impact on that
interface.  That's the best anyone is doing right now.  Of course,
vendors understand that we, as customers, can configure a subnet
smaller than /64.  They are leaving us open to link-local issues right
now even with a smaller global subnet size, but at least that cannot
be exploited from the Internet.  And as it happens, exactly the same
features / knobs are needed to fix both problems with /64, and with
link-local neighbor learning.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Why is IPv6 broken?

2011-07-10 Thread Jeff Wheeler

On Sat, Jul 9, 2011 at 5:25 PM, Bob Network network...@hotmail.com wrote:
 Why is IPv6 broken?

You should have titled your thread, my own personal rant about
Hurricane Electric's IPv6 strategy.  You may also have left out the
dodgy explanation of peering policies and technicalities, since these
issues have been remarkably static since about 1996.  The names of the
networks change, but the song remains the same.  This is not a novel
subject on this mailing list.  In fact, there have been a number of
threads discussing HE's practices lately.  If you are so interested in
them, I suggest you review the list archive.

There are quite a few serious, unresolved technical problems with IPv6
adoption besides a few networks playing chicken with their collective
customer-bases.  The lack of will on the part of vendors and operators
to participate in the IETF process, and make necessary and/or
beneficial changes to the IPv6 standards, has left us in a situation
where IPv6 implementation produces networks which are vulnerable to
trivial DoS attacks and network intrusions.

The lack of will on the part of access providers to insist on
functioning IPv6 support on CPE and BRAS platforms has even mid-sized
ISPs facing nine-figure (as in, hundred-million-dollars) expenses to
forklift-upgrade their access networks and end-user equipment, at a
time when IPv6 seems to be the only way to continue growing the
Internet.

The lack of will on the part of major transit networks, including
Savvis, to deploy IPv6 capabilities to their customers, means that
customers caught in multi-year contracts may have no option for native
connectivity.  Cogent's policy of requiring a new contract, and from
what I am still being told by some European customers, new money, from
customers in exchange for provisioning IPv6 on existing circuits,
means a simple technical project gets caught up in the complexities of
budgeting and contract execution.

If you believe that the most serious problem facing IPv6 adoption is
that HE / Level3 / Cogent don't carry a full table, you are living in
a fantasy world.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Anybody can participate in the IETF (Was: Why is IPv6 broken?)

2011-07-10 Thread Jeff Wheeler

On Sun, Jul 10, 2011 at 3:45 PM, Owen DeLong o...@delong.com wrote:
 Number two: While anyone can participate, approaching IETF as an
 operator requires a rather thick skin, or, at least it did the last couple
 of times I attempted to participate. I've watched a few times where

I am subscribed to the IDR (BGP, etc.) and LISP lists.  These are
populated with different people and cover entirely different topics.
My opinion is the following:

* The IDR list is welcoming of operators, but whether or not your
opinion is listened to or included in the process, I do not know.
Randy Bush, alone, posts more on this list than the sum of all
operators who post in the time I've been reading.  I think Randy's
influence is 100% negative, and it concerns me deeply that one
individual has the potential to do so much damage to essential
protocols like BGP.  Also, the priorities of this list are pretty
fucked.  Inaction within this working group is the reason we still
don't have expanded BGP communities for 32 bit ASNs.  The reason for
this is operators aren't participating.  The people on the list or the
current participants of the WG should not be blamed.  My gripe about
Randy Bush having the potential to do huge damage would not exist if
there were enough people on the list who understand what they're doing
to offer counter-arguments.

 operators were shouted down by purists and religion over basic
 real-world operational concerns. It seems to be a relatively routine
 practice and does not lead to operators wanting to come back to
 an environment where they feel unwelcome.

I have found my input on the LISP list completely ignored because, as
you suggest, my concerns are real-world and don't have any impact on
someone's pet project.  LISP as it stands today can never work on the
Internet, and regardless of the fine reputations of the people at
Cisco and other organizations who are working on it, they are either
furthering it only because they would rather work on a pet project
than something useful to customers, or because they truly cannot
understand its deep, insurmountable design flaws at Internet-scale.
You would generally hope that someone saying, LISP can't work at
Internet-scale because anyone will be able to trivially DoS any LISP
ITR ('router' for simplicity), but here is a way you can improve it,
well, that remark, input, and person should be taken quite seriously,
their input examined, and other assumptions about the way LISP is
supposed to work ought to be questioned.  None of this has happened.
LISP is a pet project to get some people their Ph.D.s and keep some
old guard vendor folks from jumping ship to another company.  It is a
shame that the IETF is manipulated to legitimize that kind of thing.

Then again, I could be wrong.  Randy Bush could be a genius and LISP
could revolutionize mobility.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: ICANN to allow commercial gTLDs

2011-06-17 Thread Jeff Wheeler

On Sat, Jun 18, 2011 at 12:04 AM, George B. geor...@gmail.com wrote:
 I think I will get .payme  and make sure coke.payme, pepsi.payme,
 comcast.payme, etc. all get registered at the low-low price of
 $10/year.  All I would need is 100,000 registrations to provide me
 with a million dollar a year income stream for the rest of my life.

I have read this thread, but certainly not any ICANN garbage.  It
seems to me that a TLD for a brand, like Coca-Cola, would not be used
in the same way as GTLDs.  Will George actually be allowed to carve up
his own TLD and sell bits of it to anyone who is willing to click a
checkbox on GoDaddy.com?  Obviously there is not any technical
limitation in place to prevent this, but will there be legal / layer
9 limitations?

I kinda figured additional GTLDs is not very useful given that
probably every domain registrar drives customers to protect their
brand, avoid phishing attacks against their customers, etc. by buying
not only example.com, but also net|org|biz|etc.  I imagine that
registrars may be really excited about this idea, because it
represents additional fees/revenue to them.  I can't understand why it
is good for anyone else.  Does McDonald's really want to print
http://mcdonalds/ or www.mcdonalds instead of www.mcdonalds.com on
their soft drink cups and TV ads?

Is Owen so disconnected from reality that he thinks the chain with the
golden arches is spelled MacDonald's?

I don't particularly care about the intellectual property questions
(in the context of NANOG) but if you really want to bang your head
against that, I suggest reading about the current trademark status of
Standard Oil.  In short, it remains a legally protected mark but has
several distinct owners throughout the United States -- a result of
the break-up.  Waffle House is a little complex, too.  Somehow the
GTLD system continues to function.  I imagine the relevant authorities
are capable of figuring out who should be allowed to register which
brand-TLD.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Consequences of BGP Peering with Private Addresses

2011-06-16 Thread Jeff Wheeler

On Wed, Jun 15, 2011 at 12:47 PM, James Grace ja...@cs.fiu.edu wrote:
 So we're running out of peering space in our /24 and we were considering 
 using private /30's for new peerings.  Are there any horrific consequences to 
 picking up this practice?

I agree with other posters that this is not a good practice.  Is it
somehow not possible for you to obtain additional address space?  Can
you not use neighbor-assigned /30s more frequently to avoid exhausting
your existing allocation?

For eBGP neighbors, I would sooner use non-unique /30s than utilize
RFC1918 space.  While this would not allow for correct reverse DNS,
and traceroute would be less obvious, it has fewer disadvantages than
assigning RFC1918 for your peer link-nets.  You will need to re-write
next-hop towards iBGP neighbors, though (using next-hop-self or
translating to internal numbers for routing protocol use) and you
should not re-use the same /30 twice on the same ASBR.

This may sound crazy, and it is certainly not an ideal way of doing
things; but it is an alternative worth consideration as networks
exhaust their available IPv4.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Cogent IPv6

2011-06-09 Thread Jeff Wheeler

On Thu, Jun 9, 2011 at 8:50 AM, ML m...@kenweb.org wrote:
 I guess someone with a 1 Gb commit in a not so small city deserves to be
 charged extra for a few Mbps of IPv6...

 For a not so full table at that.

We canceled some 10GbE Cogent circuits because of Cogent's refusal to
provision IPv6 without adding extra fees, and I expressed my reasoning
well in advance of canceling the first one.  I have been told that
they have now eliminated the special fee for North American customers,
but just two weeks ago I heard about this IPv6 surcharge stupidity
still being applied to Cogent's customers in Europe.

If you want to change your vendor, sometimes you have to change your vendor.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

v6 transit swaps harmful

2011-06-07 Thread Jeff Wheeler

In case there are folks who missed this in the past few years, we will
soon be past the point where IPv6 transit swaps and other incubation
tools are acceptable to customers.  How is it that Tiscali and Sprint
can only get together via IIJ?  Who is to blame?  From my perspective,
all three networks.  I'll spare you the rest of my hand-waving and
just paste the route:

% host -t AAAA www.sprint.net
www.sprint.net has IPv6 address 2600::

2600::/29
AS path: 3257 2497 6175 1239 1239 1239 1239 1239 1239 1239 I

% traceroute6 -q1 -f2 2600::
traceroute6 to 2600:: (2600::) from [redacted], 64 hops max, 12 byte packets
Skipping 1 intermediate hops
 2  xe-10-3-0.nyc20.ip6.tinet.net (2001:668:0:2::1:892)  10.896 ms
 3  2001:504:1::a500:2497:1 (2001:504:1::a500:2497:1)  13.511 ms
 4  sjc002bb01.iij.net (2001:48b0:bb00:8019::4008)  89.263 ms
 5  sjc002ix02.iij.net (2001:48b0:bb03:f::4015)  87.075 ms
 6  sl-bb1v6-sj-t-40.sprintv6.net (2001:440::ffcd::1)  92.491 ms
 7  sl-crs2-sj-po0-1-4-0.v6.sprintlink.net
(2600:0:2:1239:144:232:1:123)  89.333 ms
 8  sl-crs1-sj-po0-9-5-0.v6.sprintlink.net
(2600:0:2:1239:144:232:2:108)  95.966 ms
 9  sl-crs2-ria-po0-3-5-0.v6.sprintlink.net
(2600:0:2:1239:144:232:9:114)  97.788 ms
10  sl-crs2-fw-po0-13-2-0.v6.sprintlink.net
(2600:0:2:1239:144:232:25:160)  173.331 ms
11  sl-crs1-fw-po0-12-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:18:145)  165.577 ms
12  sl-crs3-fw-po0-7-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:1:45)  167.203 ms
13  sl-crs3-atl-po0-2-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:8:20)  169.195 ms
14  sl-crs1-atl-po0-11-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:4:48)  170.922 ms
15  sl-crs1-ffx-po0-8-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:18:119)  172.688 ms
16  sl-crs1-orl-po0-0-0-0.v6.sprintlink.net
(2600:0:2:1239:144:232:19:251)  177.762 ms
17  sl-lkdstr2-p1-0.v6.sprintlink.net (2600:0:3:1239:144:223:33:32)  177.450 ms
18  www.sprint.net (2600::)  172.235 ms
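
For anyone who wants to read the AS path above mechanically, a small sketch that collapses the prepends follows.  The function name is made up; the labels in the usage note come from the traceroute hostnames (tinet.net, iij.net, sprintv6.net, sprintlink.net), and tying 6175 to the sprintv6.net hop in particular is a guess on my part.

```python
from itertools import groupby


def collapse_prepends(as_path: str) -> list[tuple[int, int]]:
    """Return (asn, repeat_count) pairs, ignoring the trailing origin code."""
    hops = [int(tok) for tok in as_path.split() if tok.isdigit()]
    return [(asn, len(list(run))) for asn, run in groupby(hops)]


if __name__ == "__main__":
    path = "3257 2497 6175 1239 1239 1239 1239 1239 1239 1239 I"
    for asn, repeats in collapse_prepends(path):
        print(asn, "x", repeats)
```

Read that way, the path is 3257 (tinet), 2497 (IIJ), 6175, then 1239 (sprintlink) prepended seven times, which matches the detour through San Jose in the traceroute.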

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv6 foot-dragging

2011-05-12 Thread Jeff Wheeler

On Thu, May 12, 2011 at 8:39 PM, Jimmy Hess mysi...@gmail.com wrote:
 A very important distinction. The _immediate_  hit to the DFZ might be
 the same as obtaining PI V6 space,
 but the _long term_ hit to the DFZ might be much greater;

The real issue is that there are many /48 announcements today which
should be either:
1) not in the DFZ at all, but are because of
  a) accidental pollution/leaks
  b) intentional de-aggregation, which is very often inappropriate
2) should instead be PI allocations to organizations, not delegated PA space

This will only get worse unless we task the RIRs with doing the only
real job they have left in a post-v6-transition world: working to
enable connectivity without unnecessary DFZ bloat.  There is no longer
a need for RIRs to say no to allocation requests on the basis that
we will run out of (IPv6) addresses.  The sole reason for technical
barriers in the application/request process at all is to keep the DFZ
in-check.  Yet, our community still refuses to explicitly alter RIR
policy such that controlling DFZ growth is an explicit component of
the RIRs' mission.

We can very easily choose to have one of two scenarios:
1) The bad situation with IPv4, where half the DFZ is accidental leaks
or poorly-designed networks that are essentially on auto-pilot; yet
small businesses and ISPs are not able to acquire PI space for use in
BGP and must use PA blocks from their transit providers
2) An opposite situation, where the DFZ does not contain any
de-aggregates, but contains many PI routes from organizations who have
their PI space announced by their ISP for the purpose of avoiding
re-numbering, not for multi-homing using their own BGP
routers/ASN/etc.

Getting to either one of these two extremes is very easy.  Right now,
we are heading for #1.  If all technical barriers for acquiring IPv6
PI were removed, we would probably have #2.  How do we find a medium
between them, where there aren't ASNs originating 1000+ unnecessary
de-aggregates out of their own carelessness and ineptitude, but also,
there isn't a /32 (or /48) announced for every mom & pop ISP who
themselves do not participate in BGP, or every corporate branch office
who do not want to renumber when they change ISPs?

This is how RIRs are failing us.  Except that the RIRs really can't
fail us, because they do what the members direct them to do through
policy.  If we don't task them to help the community do a better job
at managing the IPv6 DFZ now, we may never be able to go back and fix
it.  The genie is out of the bottle with IPv4; but realistically, IPv6
is young enough that we have plenty of wiggle-room in terms of
allocation policy, typical inter-domain route filters, and so on.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Yahoo and IPv6

2011-05-09 Thread Jeff Wheeler

On Mon, May 9, 2011 at 3:58 PM, Doug Barton do...@dougbarton.us wrote:
 I do agree with you that pointing fingers at this stage is really not
 helpful. I continue to maintain that being supportive of those content
 networks that are willing to wade in is the right answer.

Frankly, I think the finger is simply pointing in the wrong direction.
I have zero choices for native IPv6 at home, and I'm sure that is
true for the majority of us.  SOHO CPE support barely exists because
access networks haven't been asking for it.  Call centers are
certainly not equipped to evaluate traceroute tickets or assist
users in any practical way, which is why we see "disable IPv6 and try
again" as the cookie-cutter answer to any problem when the end-user
has IPv6.

The expectation that content providers should rush to publish AAAA
records by default (instead of white-listing, etc.) at a time when
even motivated end-users can't get IPv6 without resorting to tunnels
is ridiculous.  Let's be glad that these content providers have done
all the necessary prep work, such as deploying appropriate network
infrastructure and updating their software, so that they can choose to
send AAAA responses when they want to.

This problem is, and always has been, on the access side.  Point your
fingers that way.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Finger pointing [was: Yahoo and IPv6]

2011-05-09 Thread Jeff Wheeler

On Mon, May 9, 2011 at 4:40 PM, Patrick W. Gilmore patr...@ianai.net wrote:
 Unfortunately, finger-pointing will not fix the problem.

Actually, finger-pointing is very helpful at this stage.  I was able
to change my local ISP's tune from "we have enough IPv4 addresses for
our customers, so we aren't going to support IPv6 (ever)" to "we will
start employee beta testing soon."  It ultimately took the threat of
running an Op-Ed in the business section of the local paper to get
them to realize they can't continue with their plan to offer no IPv6
support at all.

With 800,000 SOHO CPE units deployed that have no IPv6 support and no
remote firmware upgrade option on the horizon, I can understand why
they hoped they could avoid ever supporting v6 -- it will cost them,
literally, a hundred million dollars to fix their CPE situation and
deploy native IPv6 if their CPE vendor can't provide a remote update.
This is also why tunneled solutions are receiving so much effort and
attention -- truck rolls and CPE replacement are huge expenses.

If we don't start pointing fingers at these access networks, they
won't get it until the pain of IPv4 depletion lands squarely on
content networks who may eventually be unable to get any IPv4
addresses for their services, or who may be forced to buy transit from
networks who have large, legacy IPv4 pools sitting around just to get
a provider allocation.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Yahoo and IPv6

2011-05-09 Thread Jeff Wheeler

On Mon, May 9, 2011 at 4:41 PM, Jared Mauch ja...@puck.nether.net wrote:
 I'd like to see more progress getting there than finger pointing.

I would, too; but one harsh reality is that vendors are driven by
RFPs, not by what they consciously know their customers will need in
the near future.  Why should vendors invest money in features that
aren't needed to sell routers?  If customers are dumb enough to buy
them anyway, they'll buy *another* router to get those features in the
future.

I do take issue with your suggestion that /64 LANs are in any way
smart in the datacenter.  They are not.  I have some slides on this
topic: http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf
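
For a rough sense of scale, here is a back-of-the-envelope sketch in
Python.  Only the size of a /64 is a hard fact; the neighbor-table
capacity and the scan rate are made-up illustrative values, not numbers
from the slides or from any particular router, and the sketch ignores
entry timeouts entirely.

```python
import ipaddress

# A /64 LAN; 2001:db8::/32 is the IPv6 documentation prefix.
lan = ipaddress.ip_network("2001:db8:0:1::/64")

# Hypothetical values, chosen only for illustration.
NEIGHBOR_TABLE_ENTRIES = 100_000   # assumed ND cache capacity of a router
SCAN_PACKETS_PER_SEC = 10_000      # assumed rate of packets to unused addresses


def seconds_to_fill(table_entries, packets_per_sec):
    """Time for a scan of distinct unused addresses to occupy every ND slot,
    assuming each packet creates one incomplete neighbor entry."""
    return table_entries / packets_per_sec


addresses = lan.num_addresses
fill_time = seconds_to_fill(NEIGHBOR_TABLE_ENTRIES, SCAN_PACKETS_PER_SEC)

print(f"addresses in {lan}: {addresses}")
print(f"addresses per table slot: {addresses // NEIGHBOR_TABLE_ENTRIES}")
print(f"seconds for the scan to fill the table: {fill_time}")
```

Whatever values are plugged in, the subnet has vastly more addresses
than any plausible table has slots, which is the heart of the concern.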

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Yahoo and IPv6

2011-05-09 Thread Jeff Wheeler

On Mon, May 9, 2011 at 10:04 PM, Joel Maslak jmas...@antelope.net wrote:
> On Mon, May 9, 2011 at 3:57 PM, Jeff Wheeler j...@inconcepts.biz wrote:
>> I do take issue with your suggestion that /64 LANs are in any way
>> smart in the datacenter.  They are not.  I have some slides on this
>> topic: http://inconcepts.biz/~jsw/IPv6_NDP_Exhaustion.pdf
>
> There are ways of mitigating this (the easiest is to use ACLs or firewalls
> to limit traffic into a subnet from untrusted sources so that only
> legitimate traffic is allowed).

Your suggestion has two main disadvantages:
1) it doesn't work on some platforms, because input ACL won't stop ND
learn/solicit -- obviously this is bad
2) it requires you to configure a potentially large input ACL on every
single interface on the box, and adjust that ACL whenever you
provision more IPv6 addresses for end-hosts -- kinda like not having a
control-plane filter, only worse

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: How do you put a TV station on the Mbone?

2011-05-05 Thread Jeff Wheeler

On Thu, May 5, 2011 at 1:55 AM, George Bonser gbon...@seven.com wrote:
 multicast. How do I encrypt something in a way that anyone can decrypt
 but nobody can duplicate?  If I have a separate stream per user, that is

Have you ever seen a CableCARD?  That's pretty much what it does,
except not anyone can decrypt it -- you need to subscribe to some TV
channels.  There has been quite a bit of work in black-boxing the
decryption of broadcast/multicast streams to make it difficult for
end-users to pirate the content.  That's why you see HDCP logos in the
marketing fluff for displays and graphics cards, etc.

 Encryption is probably overkill anyway.  What is needed is a mechanism
 simply to say that the content is certified to have come from the source
 it claims to come from.  So ... basically ... better not to use
 multicast for anything you really might have any security issues with.
 Fine for broadcasting a video, not so fine for a kernel update.

This is a solved problem.  Not only are you able to verify the
computed checksum of a downloaded file against the distributor's
published checksum, there are plenty of applications that do this for
you -- torrent programs check each chunk and throw away
malicious/erroneous ones.
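
A minimal sketch of that per-chunk check, in Python.  The chunk
contents, the helper names, and the choice of SHA-256 are all
assumptions made for illustration; real torrent clients use their own
piece-hash format, and the published digests are assumed to arrive over
a channel the receiver already trusts.

```python
import hashlib


def sha256_hex(data):
    """Hex digest of one chunk."""
    return hashlib.sha256(data).hexdigest()


def accept_chunks(received, published):
    """Keep only chunks whose digest matches the distributor's published one."""
    good = {}
    for index, data in received.items():
        if published.get(index) == sha256_hex(data):
            good[index] = data
    return good


# Hypothetical file split into three chunks by the distributor.
original = {0: b"kernel-part-0", 1: b"kernel-part-1", 2: b"kernel-part-2"}
published = {i: sha256_hex(chunk) for i, chunk in original.items()}

# What arrived over multicast: chunk 1 was corrupted or tampered with.
received = {0: b"kernel-part-0", 1: b"evil-payload", 2: b"kernel-part-2"}

good = accept_chunks(received, published)
missing = sorted(set(published) - set(good))
print("accepted:", sorted(good))
print("re-fetch:", missing)
```

The bad chunk is dropped and queued for re-fetch; nothing about the
check depends on whether the data arrived by unicast or multicast.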

There are certainly things that need work before I can start up Jeff's
Internet Movie Channel and go into competition with HBO, but for the
most part, these are solvable if networks decided to do it.  The big
limitation is there can't be infinite groups -- FIB is only so big and
there is no agreeable mechanism for sharing the number that can be
made to exist, given current (and foreseeable) routers.  Since so many
eyeballs are sitting on ISPs that also own television networks and
other media properties, though, I don't think we will get multicast
anytime soon.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: How do you put a TV station on the Mbone?

2011-05-04 Thread Jeff Wheeler

On Wed, May 4, 2011 at 12:45 PM, Leigh Porter
leigh.por...@ukbroadband.com wrote:
 Agreed, it seems the only demand really for this live viewing is sport, news
 and background programming like the mentioned breakfast television.

I disagree with the general notion that multicast is not useful except
for live content.  Allow me to give a couple of examples that would
probably be implemented if we really had a multicast-enabled Internet,
end-to-end:

WINDOWS UPDATES
Most of us have some number of Windows machines on our networks,
probably a large number.  These updates are pervasive, and yet they
are largely delivered to end-users as unicast downloads.  If we all
had mcast, the latest and greatest Windows Update would probably be
available via mcast, and your PC would join the appropriate group,
receive the update, and be able to install it, without any unicast
traffic at all.  There may be several groups for users who have
different access network speeds, and your machine may need to
fall-back to unicast to retrieve last week's updates or get
packets/chunks that it missed, but this is far from difficult to
implement.

ON-DEMAND MOVIES
While on-demand movies are unicast today, there's no reason a content
provider couldn't take advantage of multicast for the most popular
movies, let's say new releases.  We know that the latest movies are
more popular than older titles, because they consume much more shelf
space at Blockbuster, and more storage slots in the corner RedBox.  I
might receive the first few minutes of my on-demand movie by unicast,
and catch up to a high-speed multicast stream which repeatedly
plays the same movie, faster than the real-time data rate, for users
with sufficient access speed to download it.  My set-top-box would
transition from unicast to cached data it received via mcast,
resulting in a large bandwidth savings for popular titles.

As you can see, multicast can be useful for distribution of popular
time-shifted content and data, not just sports, news, and traditional
live programming.  Whether or not we ever see wide adoption of
multicast support on end-user access networks, well, that seems
increasingly unlikely given the consolidation of ISP/last-mile and
content producers/owners.  The less ISP networks look like common
carriers from a business perspective, the less motive they have to
act like a common carrier, and provide efficient, cost-effective
access to anything users wish to download.

For someone like Comcast, multicast is the ultimate bogeyman.
End-users being able to originate content at low cost to anyone and
everyone, without expensive CDNs or network connectivity?  I could
start my own movie channel, license some indie films I want to stream,
throw some ads over them, and be in competition with traditional
television networks who pay for satellite transponders, negotiate for
carriage, etc.  There is no way a Comcast/NBC Universal would ever
make the mistake of giving their users unfettered access to multicast.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: How do you put a TV station on the Mbone?

2011-05-04 Thread Jeff Wheeler

On Wed, May 4, 2011 at 2:22 PM, Scott Helms khe...@ispalliance.net wrote:
 Local caching is MUCH more efficient than having the same traffic running in
 streams and depending on everyone's PC to try and update in the same time

This only works, of course, if there is a local cache which PCs are aware of.

 Same issue as above, even if I am watching the latest popular movie moving
 between a multicast and unicast stream everytime I pause it to get another
 beer isn't realistic.  The chances that there will be a multicast stream
 that will be in synch with me is not high at all.

You must have skipped over the word "cache" when reading my post.
I'll explain again in a little more detail, so you can understand why
the consumer who pauses the film to go get a snack is actually an
advantage for this system.

Let's say your typical movie is 5Mb/s and you want to start watching
it right away; you aren't willing to wait several minutes (or longer)
until the next multicast loop begins.  You press play and begin
receiving a 5Mb/s unicast stream, but your STB also joins an mcast
group for that movie, because it is very popular and being watched by
a huge number of users during peak time.  The mcast stream is 20Mb/s,
or 400% of real-time.  No matter what point the loop is at when you
join, you will cache the multicast data and eventually reach a point
in the movie where you no longer need the unicast stream.

Given a 2 hour movie, the worst-case is that you'll join just a minute
after the stream/loop started, in which case it will be about 30
minutes before you start viewing from multi-casted, STB-cached data,
instead of unicast streamed data.  With two subscribers watching the
movie given worst-case circumstances, there is a bandwidth
conservation of: (users - 1) * 5Mb/s * 90min, or a mean savings around
37%, for only two users.  If ten users are watching, your worst-case
bandwidth savings will be greater, 33.7Mb/s, or about 67%.
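
The arithmetic above can be reproduced with a few lines of Python.  The
function name and its defaults are just for this sketch; it applies the
stated formula, (users - 1) * rate * minutes not sent as unicast, and
compares it with every viewer pulling the whole movie by unicast.

```python
def worst_case_savings(users, rate_mbps=5.0, movie_min=120.0, unicast_min=30.0):
    """Return (mean Mb/s saved over the movie, fraction of all-unicast traffic saved)."""
    # Savings as stated in the text, in units of (Mb/s * minutes).
    saved = (users - 1) * rate_mbps * (movie_min - unicast_min)
    # Baseline: every viewer receives the entire movie as unicast.
    baseline = users * rate_mbps * movie_min
    return saved / movie_min, saved / baseline


for n in (2, 10):
    mbps, fraction = worst_case_savings(n)
    print(f"{n} users: {mbps:.2f} Mb/s saved on average, {fraction:.1%} of unicast")
```

This gives 37.5% for two viewers, and 33.75 Mb/s or 67.5% for ten,
which lines up with the rounded figures quoted above.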

If, on the other hand, you start watching the movie, then realize it
would be more enjoyable with some popcorn, your STB is already
listening to the mcast stream and caching the movie for you.  The
longer it takes your popcorn to cook, the greater the chance that the
STB will start receiving mcast data for the beginning of the movie
before you un-pause it, which means you would not need the unicast
stream at all.

In fact, if you include the probability that some users will be able
to receive data via mcast earlier than 30 minutes into the movie,
because they didn't get unlucky and press play at the worst-case
moment, your bandwidth savings for a group of ten viewers and a 400%
real-time mcast stream will be about 80%.
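
One way to sanity-check that figure is a small model of the loop, sketched
here in Python.  The model is my own simplification, not anything a real
STB implements: the viewer presses play at a uniformly random point in
the loop, the STB caches everything it hears from that moment on, and
the savings are counted the same way as in the worst-case formula,
(users - 1) * rate * minutes not sent as unicast.

```python
def unicast_minutes(loop_pos, movie_min=120.0, speedup=4.0):
    """Minutes of unicast needed when the loop is at movie position
    loop_pos (in minutes) at the moment the viewer presses play."""
    # Positions at or after loop_pos arrive on the current pass, always in time.
    # A position x before loop_pos arrives on the next pass, at time
    # (movie_min - loop_pos) / speedup + x / speedup, which is in time
    # only if x >= (movie_min - loop_pos) / (speedup - 1).
    return min(loop_pos, (movie_min - loop_pos) / (speedup - 1.0))


def summarize(users=10, movie_min=120.0, speedup=4.0, steps=12_000):
    """Worst and mean unicast minutes over evenly spaced join points,
    plus the resulting mean savings fraction for a group of viewers."""
    samples = [unicast_minutes(i * movie_min / steps, movie_min, speedup)
               for i in range(steps)]
    worst = max(samples)
    mean = sum(samples) / steps
    savings = (users - 1) / users * (movie_min - mean) / movie_min
    return worst, mean, savings


worst, mean, savings = summarize()
print(f"worst-case unicast: {worst:.1f} min")
print(f"mean unicast:       {mean:.1f} min")
print(f"mean savings, 10 viewers: {savings:.1%}")
```

Under these assumptions the worst case is 30 minutes of unicast, the
mean is 15 minutes, and the mean savings for ten viewers come out just
under 79%, consistent with "about 80%".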

The potential savings is limited by the over-speed of the mcast stream
vs real-time, and the density of mcast listener groups.  Given that
access network speeds continue to increase, yet ISPs are really not
increasing bandwidth caps, it is reasonable to assume that an ISP
might like to allow its subscribers to receive a very fast mcast
stream for a short period of time, instead of all of those subscribers
receiving many, slow mcast streams.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv4 address exchange

2011-04-19 Thread Jeff Wheeler

On Tue, Apr 19, 2011 at 12:16 PM, David Conrad d...@virtualized.org wrote:
 However, as far as I can tell, multiple registries isn't what is implicitly 
 being proposed.  What appears to be being proposed is something a bit like the 
 registry/registrar split, where there is a _single_ IPv4 registry and 
 multiple competing 'post-allocation services' providers.

Are you saying there are people who advocate creating a new ecosystem
of service providers for supplying several things that the RIRs
exclusively supply today?  IN-ADDR delegation, WHOIS registration, and
... that's pretty much it, right?  People want to separate the DNS and
WHOIS database from ARIN and create new businesses to charge new fees
for providing that?

Sign me up.  As a vendor.  I'd love to over-charge for the dead simple
task of using an API to push DNS delegation updates to the IN-ADDR
servers, and running a whois server.  What a great business!  I'm sure
GoDaddy.com would be happy to add this service to their portfolio.

Where is the value for stakeholders?  If you really want WHOIS output
with a common, unified structure, you can do that.  Bulk access to RIR
data is available today.

Maybe I'm missing something, but I don't see how a bunch of different
entities providing fragmented post-allocation services is of any
benefit.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv4 address exchange

2011-04-19 Thread Jeff Wheeler

On Tue, Apr 19, 2011 at 2:37 PM, John Curran jcur...@arin.net wrote:
    Imagine for a moment that you had quite a few
 unneeded addresses and the upheaval also meant
 no pesky policy constraints on your monetization efforts -
 would you then view it as having some benefit?  You just
 might not have the right perspective to appreciate the
 potential up$ide...

In this view, then, the benefit of independent, fragmented WHOIS
databases and API access to IN-ADDR DNS zones is that addresses could
be traded outside of RIR policy.

It seems to me that RIR policy would need to change to allow such
third-party databases to publish delegation data to DNS/WHOIS.  Since
this is the case, end-user advocates of such system should simply
argue in favor of eliminating any justification for transfer
recipients.  In this case, ARIN would naturally supply the same DNS
and WHOIS service they do to allocation-holders today.

I still see no tangible benefit to third-party DNS/WHOIS databases,
except to the operators of those databases.  The up$ide seems to be
entirely in favor of new database operators, not existing
stakeholders.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

On Tue, Apr 19, 2011 at 4:14 PM, Benson Schliesser
bens...@queuefull.net wrote:
 Meanwhile, under the current system, ARIN has managed to accumulate a $25M
 cash reserve despite an increasing budget. (see
 https://www.arin.net/participate/meetings/reports/ARIN_XXVII/PDF/Wednesday/andersen_treasurer.pdf)

If you want ARIN to reduce its fees, you can propose that. The
fiduciaries at ARIN may say, "you're right, we do have more money than
we need or foresee needing to operate," and recommend that fees be
reduced. They may provide justification for this war chest, such as
the possibility of legal battles over address transfers. Who knows?

Is your problem that ARIN spends its money poorly? I believe it does
in some ways, but the community generally does not care enough to try
to improve this. I questioned ARIN's travel budget a few years ago
and was essentially flamed for doing so.

You seem to think the difference between ARIN's expenditures and
revenues is too large, resulting in a large cash reserve. Okay, if
that's important to you, there is a forum for that discussion. I
don't think anything will be done about it through a discussion on
NANOG, but you can certainly bring it up on the various ARIN mailing
lists, or ask ARIN board/staff to share their thoughts with you.

I really don't think the cost of ARIN fees for IP address and ASN
allocations is all that important to ARIN members. In my position as
a senior technical resource for numerous ARIN members, I am much more
interested in ARIN providing more services to members, or improving
upon existing ones (IRR), than I am in any reduction of fees. Again,
my position is reflected clearly in my public mailing list posts on
this subject.

Note that one of the things I think ARIN should improve upon, which
ARIN has committed to improve, is its IRR database. There are already
alternatives available, but I'm glad ARIN has decided to increase the
usefulness and quality of its IRR database. If they don't, you can
still choose to use a third-party database.

I don't share your view that a fragmented WHOIS/DNS ecosystem would be
all that beneficial to stakeholders. In the absence of ARIN members
flocking to PPML to complain about ARIN's travel budget or its
increasing cash reserve, I don't think ARIN members are particularly
concerned about reducing ARIN's fees.

--
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator / Innovative Network Concepts

Re: IPv4 address exchange

2011-04-19 Thread Jeff Wheeler

On Tue, Apr 19, 2011 at 5:16 PM, Benson Schliesser
bens...@queuefull.net wrote:
 Without defining what an optimal cost might be, my comment was intended to 
 show that our current baseline already results in a surplus.

I don't think the cost of IPv4 addresses has anywhere to go but up.
This mysterious Nortel/Microsoft transaction would seem to give
credibility to an assumption of increasing cost.  Therefore, it stands
to reason that the cost of database services associated with being a
holder of IP addresses will be inconsequential.

If I wanted to own www.abc.com, I could do that for a pretty low cost
of  $20/year through the various dot-com registrars.  I am pretty
sure ABC would not sell it to me for any price I could afford.  Thus,
the cost of that domain name lies not with the database services but
with the unique string.

If anyone thinks that won't be true for IP addresses, by all means,
let that person propose to overhaul the IN-ADDR system and possibly
the WHOIS database.  I do not think stakeholders will agree with their
views.  IP addresses are finite, and the cost of acquiring them will,
in all likelihood, dwarf the cost of publishing ownership/custodial
information or operational DNS records.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Implementations/suggestions for Multihoming IPv6 for DSL sites

2011-04-18 Thread Jeff Wheeler

2011/4/18 Lukasz Bromirski luk...@bromirski.net:
 LISP scales better, because with introduction of *location*
 prefix, you're at the same time (or ideally you would)
 withdraw the original aggregate prefix. And as no matter how
 you count it, the number of *locations* will be somewhat
 limited vs number of *PI* address spaces that everyone wants

I strongly disagree with the assumption that the number of
locations/sites would remain static.  This is the basic issue that
many folks gloss over: dramatically decreasing the barrier-to-entry
for multi-homing or provider-independent addressing will, without
question, dramatically increase the number of multi-homed or
provider-independent sites.

LISP solves this problem by using the router's FIB as a
macro-flow-cache.  That's good except that a site with a large number
of outgoing macro-flows (either because it's a busy site, responding
to an external DoS attack, or actually originating a DoS attack from a
compromised host) will cripple that site's ITR.

In addition, the current negative mapping cache scheme is far from
ideal.  I've written a couple of folks with a provably superior scheme
(compared to existing work), and have received zero feedback.  This is
not good.

 We may of course argue that the current routers are pretty
 capable in terms of processing power for control-plane, but

We agree that the ability to move tasks from the router to an external
control plane is good.  BGP may get better at this as time goes on,
too.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv4 address exchange

2011-04-18 Thread Jeff Wheeler

On Mon, Apr 18, 2011 at 7:33 PM, David Conrad d...@virtualized.org wrote:
 [ARIN] does not have full buy-in from those who they would try to regulate

ARIN has all the buy-in they need: No transit network will (except by
act of omission/mistake) allow you to announce IPs that aren't
registered to you in an RIR database, or delegated to you by the
registrant of those IPs.
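
As an illustration of the kind of check a transit network applies, here
is a hedged Python sketch.  The function names and the customer's prefix
list are hypothetical (the prefixes are documentation ranges); a real
network would build the list from RIR/IRR data and enforce it in router
policy rather than in a script.

```python
import ipaddress


def build_filter(registered_prefixes):
    """Parse the prefixes registered or delegated to a customer."""
    return [ipaddress.ip_network(p) for p in registered_prefixes]


def accept_announcement(prefix, allowed):
    """Accept a BGP announcement only if it falls inside an allowed prefix."""
    net = ipaddress.ip_network(prefix)
    # Compare versions first: subnet_of() raises TypeError on mixed v4/v6.
    return any(net.version == a.version and net.subnet_of(a) for a in allowed)


# Hypothetical customer holding one IPv4 block and one IPv6 block.
customer = build_filter(["192.0.2.0/24", "2001:db8::/32"])

for announced in ("192.0.2.0/24", "2001:db8:1::/48", "203.0.113.0/24"):
    verdict = "accept" if accept_announcement(announced, customer) else "reject"
    print(f"{announced}: {verdict}")
```

The first two announcements are covered by the customer's registrations
and pass; the third is not and is rejected.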

I am unapologetic when it comes to ARIN.  They are very bad at a lot
of things, and they allow themselves to be railroaded by organizations
that have out-sized budgets / influence (see my post a few years ago
regarding Verizon Wireless.)  My list of ARIN gripes is as long as
the day, but I'll spare you the details.

If we didn't have ARIN, we would probably have one of two things:
1) no regulator at all, thus BGP anarchy (we came surprisingly close
to that in the 1990s at least once)
2) a worse regulator who is totally uninterested in the small ISP /
hosting shop / Fortune 50,000, as opposed to the Fortune 500

If ARIN's primary benefit to us is to protect us from these two
unarguably worse evils, they are doing a fine job.  Even from my
outsider's perspective, I understand that ARIN is sometimes forced to
make significant compromises, which we may find objectionable, to
prevent us from being truly thrown to the wolves.

Would I like ARIN to function better?  Sure, in plenty of ways.  I do
not think it would function better if it were just a WHOIS database.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: IPv4 address exchange

2011-04-18 Thread Jeff Wheeler

On Mon, Apr 18, 2011 at 10:35 PM, David Conrad d...@virtualized.org wrote:
 And yet, Ron has recently raged on this list about hijacked prefixes used for 
 spamming, so clearly "no transit network" is inaccurate.

I try to qualify my remarks when necessary.  In this case, I wrote
"except by act of omission/mistake," and you evidently did not read
that carefully, or have construed "transit network" to mean any
two-bit ISP with one BGP customer (or shell company downstream of
them), rather than serious, global networks.

 Regardless, for sake of argument, let's assume ARIN refused to recognize the 
 Microsoft/Nortel sale and Microsoft deploys a few prefixes of those 666K 
 addresses for (say) new MSN services. Do you think ISPs, particularly the 
 larger ones, all over the world would refuse to accept those announcements 
 (especially when their call centers start getting calls from irate customers 
 who aren't able to gain access to MSN services)?

ARIN has very carefully allowed our industry to largely avoid this
choice, as InterNIC did before.  Their methods have sometimes been
objectionable, but the devil we know is better than the devil we
don't.

>> 1) no regulator at all, thus BGP anarchy (we came surprisingly close to
>> that in the 1990s at least once)
>
> And the solution to that BGP anarchy (by which I assume you mean a flood of
> long prefixes)

No, I mean if ARIN had lost its perceived or actual legitimacy, and
networks really were able to permanently hijack whatever IPs they
decided to claim for themselves, we would have had anarchy at worst,
or more likely, transit-free ISPs with commercial interest in
customers not having portable address space controlling all
allocations of portable addresses.

This almost happened.

 We're talking about IPv4 addresses which will (soon) be unavailable

I'm not confused about that.  If it were up to me, I would simply
freeze all IPv4 allocations immediately.  I do not think the current
sale-and-transfer scheme is good.  I also don't *care* that much,
because the more screwed up the legacy IPv4 Internet becomes, and
the faster it gets there, the better it is for my business.  I'm
pretty sure I am not alone in this thinking.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Implementations/suggestions for Multihoming IPv6 for DSL sites

2011-04-13 Thread Jeff Wheeler

On Tue, Apr 12, 2011 at 4:59 AM, Luigi Iannone
lu...@net.t-labs.tu-berlin.de wrote:
 This is not true. There are several works out there showing that the FIB will 
 not grow as you are saying.

I have taken some time to discuss this off-list with Luigi.  I'd
already read the paper he had in mind, which does not address DoS or
prefix growth as the number of multi-homed sites, or single-homed
sites with PI blocks, increases.

In effect, that paper and other works on this subject fail to consider
what happens when one of LISP's goals actually becomes true: more
wide-spread adoption of its technology to enable branch offices and
other end-users to become multi-homed, or avoid renumbering.

Plain and simple, it does not scale up any better than injecting more
routes into the DFZ, unless you 1) accept macro-flow-based routing; or
2) scale up the size of your FIB along with the much larger number of
prefixes which would be introduced by lowering the barrier-to-entry
for multi-homing and provider-independent addressing.

However, LISP does have non-Internet applications which are
interesting.  You can potentially have multi-homed connectivity
between your own branch offices, using one or more public Internet
connections at each branch, and your own private mapping servers which
know the state of reachability from one branch to the others.  In
effect, it can become a poor man's L3VPN.

Beyond non-Internet applications such as this, I think LISP is useful
largely as a case study for what happens when a bunch of engineers get
together and solve some problems they do not understand -- DFZ
size/growth being chief among them.

Like others, I still leave room for the possibility that I am wrong about this.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Implementations/suggestions for Multihoming IPv6 for DSL sites

2011-04-11 Thread Jeff Wheeler

On Mon, Apr 11, 2011 at 11:26 AM, Owen DeLong o...@delong.com wrote:
 I'd agree with you if it weren't for the fact I keep thinking I just about 
 understand LISP and then get told
 that my understanding is incorrect (repeatedly).

I agree it is not simple.

At a conceptual level, we can think of existing multi-homing practices
as falling into one of three broad categories:
1) more state in DFZ -- end-site injects a route into BGP

2) triangular routing -- tunnel/circuits/etc to one or more upstream
routers while not injecting anything to DFZ

3) added work/complexity on end-host -- SCTP and friends

LISP is a compromise of all these things, except #3 happens on a
router which does tunneling, not the end-host.  Whether you think it's
the best of both [three?] worlds, or the worst of them, is up to
you.

I personally believe LISP is a horrible idea that will have trouble
scaling up, because a large table of LISP mappings is not any easier
to store in FIB than a larger DFZ.  The solution the LISP folks
think works for this is a side-chain mapping service which the router
can query to set up encapsulation next-hops on-demand.  That means if
your FIB isn't big enough to hold every mapping entry, you are
essentially doing flow-based routing, but with flows defined as
being toward a remotely-defined end-site rather than toward an
individual IP address (so not quite as bad as the flow-based routing
of the past, but still bad).

Maybe I also don't understand LISP and need to RTFM more, but my
current understanding is that it is a dead-end technology without the
ability to dramatically scale up the number of multi-homed end-sites
in a cheaper manner than what is done today with BGP.

I think we would be better off with more work on things like SCTP.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Implementations/suggestions for Multihoming IPv6 for DSL sites

2011-04-11 Thread Jeff Wheeler

On Mon, Apr 11, 2011 at 2:03 PM, Owen DeLong o...@delong.com wrote:
 I do tend to think that any technology sufficiently confusing that I cannot
 understand it well after reasonable effort is of questionable value
 for wide deployment.

The secret is to ignore all the crazy acronyms and boil it down to
this -- LISP sets up tunnels to remote end-points based on what it
learns from a mapping server, and these tunnels may be used by one or
more end-to-end flows.

>> I personally believe LISP is a horrible idea that will have trouble
>> scaling up, because a large table of LISP mappings is not any easier
>> to store in FIB than a larger DFZ.  The solution the LISP folks
> This is one of the few parts of LISP I do understand and I'm not entirely
> convinced that it is all that bad because you don't have to do this on
> core routers, you can push it out pretty close to the customer edge,
> possibly even on the customer side of said edge.

We already have this in the core today, thanks to MPLS.  The problem
with LISP is the router that does encapsulation, which you can think
of as conceptually identical to a PE router, must have a large enough
FIB for all simultaneous flows out of the customers behind that PE
router.  This may be a very large number for an end-user PE router
with a bunch of subscribers behind it running P2P file sharing, and
may also be very large for a hosting shop with end-users from all over
the globe downloading content.  In the case of a CDN, one distributed
CDN node may have far fewer active flows (installed in FIB) than the
size of the DFZ, since the CDN would intend to direct end-users to a
geographically-local CDN node.

As you know, I like to think of what happens when you receive a DDoS.
In the case of LISP, if there are a huge number of source addresses
sending just one packet to you that generates some kind of reply, your
PE router will query its mapping server, install a new
tunnel/next-hop, and transmit the reply packet.  If the FIB is not
large enough to install every flow, it will churn, creating a DoS
condition essentially identical to what we saw with older flow-cache
based routers when they were subjected to traffic to/from a very large
number of hosts.
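
The churn can be illustrated with a toy simulation in Python.  This is
only a sketch under loud assumptions: the FIB is modeled as a fixed-size
LRU cache of mappings, every miss stands in for a mapping-server query
plus a tunnel install, and the capacity and traffic mixes are invented
numbers.  Real routers and real LISP implementations differ in many
details.

```python
from collections import OrderedDict


class MappingCache:
    """Fixed-size LRU cache standing in for FIB slots that hold mappings."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, remote_site):
        if remote_site in self.entries:
            self.entries.move_to_end(remote_site)  # mark as recently used
            self.hits += 1
            return
        # Miss: the router would query the mapping server and install a tunnel.
        self.misses += 1
        self.entries[remote_site] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used


def miss_rate(capacity, destinations):
    """Fraction of reply packets that needed a new mapping installed."""
    cache = MappingCache(capacity)
    for site in destinations:
        cache.lookup(site)
    return cache.misses / (cache.hits + cache.misses)


# Normal load: 10,000 replies spread over 500 remote sites.
normal = [i % 500 for i in range(10_000)]
# Attack: 10,000 replies, each toward a different source site.
attack = list(range(10_000))

normal_rate = miss_rate(1_000, normal)
attack_rate = miss_rate(1_000, attack)
print(f"normal traffic miss rate: {normal_rate:.0%}")
print(f"attack traffic miss rate: {attack_rate:.0%}")
```

With the working set inside the cache, only the first packet to each
site misses (5% here); when every packet goes to a new site, every
packet misses and the cache does nothing but churn.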

Like you, I am not 100% sure of my position on LISP, but I do think I
understand it has a very serious design limit that probably doesn't
make things look any better than polluting the DFZ from the
perspective of content providers or end-user ISPs.  It does have
benefits from the carrier perspective because, as you say, it can move
the PE router into the customer's network and move state information
from the carrier to the edge; but I think this comes at a high
complexity cost and might result in overall more work/cost for
everyone.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Level 3 Agrees to Purchase Global Crossing

2011-04-11 Thread Jeff Wheeler

If I were a large tier-2 with SFI to one, but not both, of Level3 and
GBLX, I would see this acquisition as an opportunity to squeeze
peering out of the other network, or eventual combination of both, in
trade for not stirring the pot with regulators.  Perhaps AS3356 will
carry AS6939 IPv6 routes soon, etc.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: [torix-ops] Fabric Issues Update

2011-04-08 Thread Jeff Wheeler

Netelligent's sessions are also down to allow for troubleshooting
without disrupting customer traffic, and we'll turn back up once TORIX
indicates everything is okay.

For any members who might have usage-based billing for carrier
transport to TORIX, it is worth mentioning that if you see extra
junk traffic coming to your port, it is likely your transport
provider would bill you for this traffic even though it is not bound
for your router.  For example, I see about 350Mbps of junk right now
with our BGP sessions down, so if we had to pay per-Megabit for
backhaul it would push that figure up for the month.

If a member in this situation wanted to avoid a larger bill they would
need to turn down their port in order to avoid being charged for the
traffic, as deactivating BGP obviously only affects your good
ingress and not the junk.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: State of QoS peering in Nanog

2011-04-02 Thread Jeff Wheeler

On Sat, Apr 2, 2011 at 5:56 PM, Leo Bicknell bickn...@ufp.org wrote:
 The PSTN features fixed, known bandwidth.  QoS isn't really the
 right term.  When I nail up a BRI, I know I have 128kb of bandwidth,
 never more, never less.  There is no function on that channel similar
 to IP QoS.

The PSTN also has exactly one unidirectional flow per access port.
This is not true of IP networks, where an end-user access port may
have dozens of flows going at once for common web browsing, and
perhaps hundreds of flows when using P2P file sharing applications,
etc.  The lifetime of these flows may be several hours (streaming
movie) or under a second (web browser.)

Where the PSTN has channels between two access ports (which might be
packetized within the backbone) and a relatively complex control plane
for establishing flows, the IP network has little or no knowledge of
flows, and if it does have any knowledge of them, it's not because a
control plane exists to establish them, it's because punting from the
data plane to the control plane allows flow state to be established
for things like NAT.

 Basically, you could mandate QoS on every peering link in the
 Internet and I suspect 99% of the end users would never notice any
 change.

I don't agree with this.  IMO all DDoS traffic would suddenly be
marked into the highest priority forwarding class that doesn't have an
absurdly low policer for the DDoS source's access port, and as a
result, DDoS would more easily cripple the network, either from
hitting policers on the higher-priority traffic and killing streaming
movies/voip/etc, or in the absence of policers, it would more easily
cause significant packet loss to best-effort traffic.
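
A rough way to see both outcomes is a strict-priority link model.
This is only a sketch under simplifying assumptions: one link, two
classes, a single aggregate policer on the priority class, and drops
that fall proportionally on legitimate and attack packets.  The
function names and traffic figures are invented for the example.

```python
def deliver(link_mbps, priority_offered, best_effort_offered,
            policer_mbps=None):
    """Return (priority_delivered, best_effort_delivered) in Mbps.

    Strict priority: the priority class is served first, optionally
    capped by a policer; best effort gets whatever capacity is left.
    """
    priority = priority_offered
    if policer_mbps is not None:
        priority = min(priority, policer_mbps)
    priority = min(priority, link_mbps)
    best_effort = min(best_effort_offered, link_mbps - priority)
    return priority, best_effort


def legit_priority_share(priority_delivered, legit_offered, attack_offered):
    """Legitimate part of the delivered priority traffic, assuming the
    link or policer cannot tell legitimate packets from attack packets."""
    total = legit_offered + attack_offered
    return priority_delivered * legit_offered / total


LINK = 1000          # Mbps
LEGIT_PRIORITY = 100  # streaming / VoIP
BEST_EFFORT = 600
ATTACK = 2000        # DDoS marked into the priority class

print(deliver(LINK, LEGIT_PRIORITY, BEST_EFFORT))
print(deliver(LINK, LEGIT_PRIORITY + ATTACK, BEST_EFFORT))
policed = deliver(LINK, LEGIT_PRIORITY + ATTACK, BEST_EFFORT,
                  policer_mbps=200)
print(policed, legit_priority_share(policed[0], LEGIT_PRIORITY, ATTACK))
```

Without a policer the attack takes the whole link and best-effort gets
nothing; with a 200 Mbps policer best-effort survives, but only a few
Mbps of the legitimate priority traffic makes it through.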

I think end-users would notice because their ISP would suddenly grind
to a halt anytime a clever DDoS was directed their way.

We will no sooner see a practical solution to this than we will one
for large-scale multicast in backbone and subscriber access networks.
The limitations are similar: to be effective, you need a lot more
state for multicast.  For a truly good QoS implementation, you need a
lot more hardware counters and policers (more state.)  If you don't
have this, all your QoS setup will do, deployed across a large
Internet subscriber access network, is work a little better under
ideal conditions, and probably a lot worse when subjected to malicious
traffic.

 2) Get access ISPs to offer QoS on customer access ports, ideally in
   some user configurable way.

I do agree that QoS should be available to end-users across access
links, but I don't agree with pushing it further towards the core
unless per-subscriber policers are available beyond those on access
routers.  Otherwise, all someone has to do to be mean to Netflix is
send a short-term, high-volume DoS attack that looks like Netflix
traffic towards an end-user IP, which would interrupt movie-viewing
for a potentially larger number of users, or at least as many
end-users as the same DoS would in the absence of any QoS.  The case
of per-subscriber policers pushed further towards the ISP core fares
better.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Regional AS model

2011-03-28 Thread Jeff Wheeler

On Mon, Mar 28, 2011 at 5:40 PM, Owen DeLong o...@delong.com wrote:
 I agree that allowas-in is not as bad as default, but, I still think that 
 having one AS per routing policy makes a hell of a
 lot more sense and there's really not much downside to having an ASN for each 
 independent site.

Well, let's say I'm a medium/large transit network like Hurricane
Electric, with a few far-flung POPs that have backup transit.  I've
got a POP in Miami, Minneapolis, or Toronto which has single points of
backbone failure, e.g. one circuit/linecard/etc might go down, while
the routers at the POP remain functional, and the routers in the rest
of the network remain functional.  What happens?

1) with allowas-in your remote POP will still learn your customers'
routes by any transit you might have in place there
2) with default route toward transit (breaking uRPF) you would not
learn the routes but still be able to reach everything
3) with neither of these solutions, your single-homed customers at the
broken POP could not reach single-homed customers elsewhere on your
backbone, even if you have backup transit in place.
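
The difference between cases 1 and 3 comes down to ordinary eBGP loop
detection.  A minimal sketch follows, assuming a knob that tolerates a
configured number of occurrences of the local ASN in the AS_PATH
(roughly what allowas-in does); the function name and the ASNs, which
are taken from the documentation range, are illustrative only.

```python
def accept_route(as_path, local_as, allowas_in=0):
    """Return True if a received route passes eBGP loop detection.

    as_path    -- list of ASNs from the received AS_PATH
    local_as   -- our own ASN
    allowas_in -- how many occurrences of local_as to tolerate
    """
    return as_path.count(local_as) <= allowas_in


LOCAL_AS = 64496     # our single global AS
TRANSIT_AS = 64510   # hypothetical backup transit at the remote POP
CUSTOMER_AS = 64500  # customer attached elsewhere on our backbone

# What the isolated POP hears from its backup transit.
path = [TRANSIT_AS, LOCAL_AS, CUSTOMER_AS]

print(accept_route(path, LOCAL_AS))                # False
print(accept_route(path, LOCAL_AS, allowas_in=1))  # True
```

With the default behavior the isolated POP discards the route because
its own ASN is already in the path, which is exactly case 3.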

I'm not bashing on HE for possibly having a SPOF in backbone
connectivity to a remote POP.  I'm asking why you don't choose to use
a different ASN for these remote POPs.  After all, you prefer that
solution over allowas-in or default routes.

Oh, that's right, sometimes you have a business and/or technical need
to operate a single global AS.  Vendors have given us the necessary
knobs to make this work right.  There's nothing wrong with using them,
except in your mind.

Should every organization with a backbone that has an SPOF grab some
more ASNs?  No.  Should every organization with multiple distinct
networks and no backbone use a different ASN per distinct network?
IMO the answer is probably yes, but I am not going to say it's always
yes.

I'll agree with you in a general sense, but if your hard-and-fast rule
is that every distinct network should be its own ASN, you had better
start thinking about operational failure modes.  Alternatively, you
could allow for the possibility that allowas-in has plenty of
legitimate application.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: The growth of municipal broadband networks

2011-03-25 Thread Jeff Wheeler

On Fri, Mar 25, 2011 at 10:52 PM, George Bonser gbon...@seven.com wrote:
 I don't.  What happens when the government then decides what content
 is and is not allowed to go over their network?  If one had a site that
 provided a view that the government didn't like, would they cut it off?

I appreciate your argument.

When asked by Uncle Sam, the major RBOCs were apparently happy to hand
over customers' records and tap into their phones in direct violation
of the law.  *Asked*, not ordered by a court or any legally-empowered
person or entity.  The companies and LEOs then had to fight for
RETROACTIVE PROTECTION FROM THEIR WILLFUL VIOLATIONS OF THE LAW, which
was granted by our federal legislature.

I think we would be far, far better off, from the perspective of
liberty, with a thousand small last-mile providers, some of which will
hopefully be owned by cities/counties/states and some of which would
hopefully be privately-operated.  It's a lot harder to coerce (or just
ask) a thousand small access providers to block some objectionable
or dangerous content or activity without getting caught than it is
to do the same if there are only a handful of access providers.

Since there is no liberty advantage, in the real world, to a system
where AT&T controls the last mile over one where states, counties, or
private contractors control it, I would choose the one most likely to create
a competitive business environment.  We already know that homes
without cable television and Internet service are less valuable than
homes which have access to these services.  I hope that communities
would develop and maintain the best last-mile networks they can in
order to attract businesses and residents with the most money to
spend, and the most to contribute to their tax bases, job market, and
skilled labor pool.

In an ideal world, I could agree with you.  But you don't need a
tin-foil hat anymore to be absolutely certain that big brother has
over-stepped his bounds and will continue to do so even in an
environment where private businesses *could* be an obstacle.  Guess
what, they aren't.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Regional AS model

2011-03-24 Thread Jeff Wheeler

On Thu, Mar 24, 2011 at 5:51 PM, Graham Wooden gra...@g-rock.net wrote:
 with one site being in the middle. I only have one public AS, but I have
 selected doing the confederation approach (which some may consider to be
 overkill with only three edges).

There are really several issues to consider, one of which certainly is
overkill, but the others are:
1) in your case, you have to run allowas-in *anyway* because if your
transport or your middle POP goes down, so will your network and its
customers; so confederating isn't really buying you anything unless
your backbone is really vendor L3VPN
2) confederating / clustering can add to MED headaches and similar

While this is not directed at your deployment specifically, it is a
common newbie mistake to confederate something that doesn't need to
be, or to otherwise complicate your backbone because you think you
need to turn knobs to prepare for future growth.  Guess what, that
growth might happen later on, but if you don't understand emergent
properties of your knob-turning, your plan for the future is really a
plan to fail, as you'll have to re-architect your network at some
point anyway, probably right when you need that scalability you
thought you engineered in early-on.

List readers should be strongly discouraged from confederating unless
they know they need to, understand its benefits, and understand its
potential weaknesses.  In general, a network with effectively three or
six routers should never have a confederated backbone.  The number of
guys who really understand confederating / route-reflection within the
backbone is very small compared to the number of guys who *think* they
are knowledgeable about everything that falls under "router bgp", our
beloved inter-domain routing protocol which gives the operator plenty
of rope with which to hang himself (or the next guy who holds his job
after he moves on.)

On Thu, Mar 24, 2011 at 7:50 PM, Jeffrey S. Young yo...@jsyoung.net wrote:
 On the other hand if we'd had this capability years ago the notion
 of a CDN based on anycasting would be viable in a multi-provider
 environment.  Maybe time to revive that idea?

That draft doesn't identify any particular technical challenges to
originating a prefix from multiple discrete origin ASNs other than the
obvious fact that they'll show up in the various inconsistent origin
AS reports such as CIDR Report, etc.  It of course does identify some
advantages to doing it.

I imagine Danny McPherson and his colleagues have spent some time
looking into this, and can probably say with confidence that there are
in fact no real challenges to doing it today besides showing up in
some weekly email as a possible anomaly.  It seems to be a taboo
topic, but once a few folks start doing it, I think it'll quickly
become somewhat normal.

Note that in the current IRR routing information system, it is
possible to publish two route objects, each with the same prefix, and
each with a different origin ASN.  This is by design.
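
As an illustration, here is a small sketch that reads two trimmed,
hypothetical route objects and groups origins by prefix.  Real route
objects carry more attributes (descr, mnt-by, and so on); the prefix,
ASNs, and source name below come from documentation ranges or are
invented, and the parser is a toy, not an RPSL implementation.

```python
from collections import defaultdict

# Hypothetical, heavily trimmed route objects.
RPSL = """\
route:   192.0.2.0/24
origin:  AS64496
source:  EXAMPLE

route:   192.0.2.0/24
origin:  AS64497
source:  EXAMPLE
"""


def origins_by_prefix(text):
    """Map each route prefix to the set of origin ASNs registered for it."""
    result = defaultdict(set)
    for block in text.strip().split("\n\n"):
        attrs = {}
        for line in block.splitlines():
            key, _, value = line.partition(":")
            attrs[key.strip()] = value.strip()
        if "route" in attrs and "origin" in attrs:
            result[attrs["route"]].add(attrs["origin"])
    return dict(result)


for prefix, origins in origins_by_prefix(RPSL).items():
    print(prefix, sorted(origins))
```

A prefix that maps to more than one origin here is the same thing the
inconsistent-origin reports flag; the registry itself accepts it.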

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Nortel, in bankruptcy, sells IPv4 address block for $7.5 million

2011-03-24 Thread Jeff Wheeler

What is needed is for the networks in the transit-free club to decide
they will not honor any gray market route advertisements resulting
from extra-normal transfers of this nature, whether the announcement
is from a peer or a customer.  As we are all aware, no real dent was
ever made in routing table growth except by Sprint deciding what it
was willing to accept.

A huge, unchecked gray market benefits bad guys, such as spammers,
much more than it does ordinary operators and end-users; on this I
think we can all agree.

The recent thread on DFZ growth also demonstrates clearly that
uncertainty as to whether or not such an unchecked gray market will be
allowed to exist, or even thrive, is driving most of us to strike
routers with 500k FIB from our list (many of us have been doing so for
years.)  This means that the uncertainty has already created cost for
operators and thus end-users.

The sooner the big players get together on this and decide not to
allow such a gray market, the better off we will be.  Since some of
these big players have huge legacy address pools already, there is
little disadvantage to those networks refusing to honor gray market
announcements from their customers, and probably no disadvantage to
accepting them from peers, as long as they are not the sole actor.

I anxiously await an xtra-normal announcement forbidding extra-normal routes.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: SP's and v4 block assignments

2011-03-20 Thread Jeff Wheeler

On Sun, Mar 20, 2011 at 3:28 AM, Owen DeLong o...@delong.com wrote:
 This assumes an HFC network and not a PON or DSL topology
 where it is not an issue.

It assumes that the access network topology does not employ any kind
of triangular routing to terminate the subscriber's layer-3 traffic on
a desired access router, as opposed to one dictated by where the
subscriber's layer-1 facility terminates.

It's really not an issue of HFC or DSL, and I guess I should have
spelled it out since several folks failed to understand that -- it's
an issue of carrying routes for customer static IPs in your IGP or
being able to steer their sessions to a certain device.

I'm sure we all remember the days when ordinary dial-up subscribers
could get a static IP address from nation-wide dial-up ISPs, and the
network took care of routing that static IP to whatever box was
receiving the modem call.  The problems with scaling up static IPs for
fixed-line services are much less troublesome than a nation-wide
switched access service like dial-up; but the same basic constraints
apply -- you need triangular routing, or a bigger routing table, when
users' static IPs are not bound to an aggregate pool for their layer-3
access router.
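
To make the "bigger routing table" half of that trade-off concrete,
here is a sketch that counts the host routes an IGP would have to
carry when a static IP sits outside the aggregate pool of the access
router the subscriber actually terminates on.  The router names,
pools, and addresses are hypothetical (documentation ranges), and the
model ignores triangular routing entirely.

```python
import ipaddress


def extra_routes(pools, statics):
    """List host routes needed for statics that fall outside their
    access router's aggregate.

    pools   -- dict: access router name -> aggregate prefix
    statics -- list of (static IP, router the subscriber terminates on)
    """
    extra = []
    for addr, router in statics:
        pool = ipaddress.ip_network(pools[router])
        if ipaddress.ip_address(addr) not in pool:
            extra.append(f"{addr}/32 via {router}")
    return extra


pools = {
    "ar1": "198.51.100.0/25",
    "ar2": "198.51.100.128/25",
}
statics = [
    ("198.51.100.10", "ar1"),  # inside ar1's aggregate, no extra route
    ("198.51.100.20", "ar2"),  # subscriber was moved to ar2, kept the IP
]

print(extra_routes(pools, statics))
```

Every subscriber who keeps an address after being moved adds one more
specific route, which is why the "bound to an aggregate pool" case is
the cheap one.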

"Almost static" IPs, which remain unchanged until your ISP has some
need to reorganize their access network and move you into a different
IP address pool, are a good compromise that is okay for many
end-users.  That eliminates all the technical challenges (from the ISP
perspective) and yet there are many ISPs that offer this product only
to business customers, not ordinary residential subscribers -- which
means you're still left with the issue that they simply don't want to
offer anything like a static IP to the lowest-margin customers, as
they hope it will force some subscribers to upgrade to a higher-cost
service.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: CSI New York fake IPv6

2011-03-20 Thread Jeff Wheeler

On Sun, Mar 20, 2011 at 10:21 PM, Jay Ashworth j...@baylink.com wrote:
 No, there are several reserved stretches of both IPv4 and DNS space
 for just such reasons.  example.com is the most common and well known,
 but see also RFC 3330 and RFC 5737, not necessarily in that order.

See also this thread
http://mailman.nanog.org/pipermail/nanog/2011-March/034179.html from
less than two weeks ago for discussion of this in relation to IPv6.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: SP's and v4 block assignments

2011-03-19 Thread Jeff Wheeler

On Sat, Mar 19, 2011 at 11:53 AM, Nathan Eisenberg
nat...@atlasnetworks.us wrote:
 As for charging for residential static assignments, I don't think it's all 
 that odd, or 'despicable'.  Allocating static assignments consumes engineer 
 time for configuration and documentation.  On a business class service, you 
 can eat that cost fairly easily.  On a low-yield residential circuit, there 
 has to be some long term ROI because that work probably takes the margin out 
 of the service for months.

Engineer time is not an issue.  If it requires an engineer for
configuration and documentation, the provisioning process is
already flawed.  The reason to not want residential users to have
static IPs is that these users represent large chunks of traffic which
can be easily moved from one group of HFC channels to another when
additional capacity must be created by breaking up access network
segments.  If many users had a static IP, this would be more
difficult.  Since most users don't have a static IP, the overhead of
dealing with the few users who do is entirely manageable, especially
when these users are paying a higher fee.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: bfd-like mechanism for LANPHY connections between providers

2011-03-16 Thread Jeff Wheeler

On Wed, Mar 16, 2011 at 2:33 PM, Jensen Tyler jty...@fiberutilities.com wrote:
 We have many switches between us and Level3 so we don't get a interface 
 down to drop the session in the event of a failure.

This is often my topology as well.  I am satisfied with BGP's
mechanism and default timers, and have been for many years.  The
reason for this is quite simple: failures are relatively rare, my
convergence time to a good state is largely bounded by CPU, and I do
not consider a slightly improved convergence time to be worth an
atypical configuration.  Case in point, Richard says that none of his
customers have requested such configuration to date; and you indicate
that Level3 will provision BFD only if you use a certain vendor and
this is handled outside of their normal provisioning process.

For an IXP LAN interface and associated BGP neighbors, I see much more
advantage.  I imagine this will become common practice for IXP peering
sessions long before it is typical to use BFD on
customer/transit-provider BGP sessions.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: bfd-like mechanism for LANPHY connections between providers

2011-03-16 Thread Jeff Wheeler

On Wed, Mar 16, 2011 at 4:42 PM, Jensen Tyler jty...@fiberutilities.com wrote:
 Correct me if I am wrong but to detect a failure by default BGP would wait 
 the hold-timer then declare a peer dead and converge.

 So you would be looking at 90 seconds(juniper default?) + CPU bound 
 convergence time to recover? Am I thinking about this right?

This is correct.  Note that 90 seconds isn't just a Juniper default.
This suggested value appeared in RFC 1267 §5.4 (BGP-3) all the way
back in 1991.

In my view, configuring BFD for eBGP sessions is risking decreased
MTBF for rare reductions in MTTR.

This is a risk / reward decision that IMO is still leaning towards
lots of risk for little reward.  I'll change my mind about this
when BFD works on most boxes and is part of the standard provisioning
procedure for more networks.  It has already been pointed out that
this is not true today.

If your eBGP sessions are failing so frequently that you are very
concerned about this 90 seconds, I suggest you won't reduce your
operational headaches or customer grief by configuring BFD.  This is
probably an indication that you need to:
1) straighten out the problems with your switching network or transport vendor
2) get better transit
3) depeer some peers who can't maintain a stable connection to you; or
4) sacrifice something to the backhoe deity

Again, in the case of an IXP interface, I believe BFD has much more
potential benefit.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: bfd-like mechanism for LANPHY connections between providers

2011-03-16 Thread Jeff Wheeler

On Wed, Mar 16, 2011 at 8:00 PM, Sudeep Khuraijam
skhurai...@liveops.com wrote:
 There a difference of several orders of magnitude  between BFD keepalive 
 intervals  (in ms) and BGP (in seconds) with generally configurable 
 multipliers vs. hold timer.
 With Real time media and ever faster last miles, BGP hold timer may find 
 itself inadequate, if not in appropriate in some cases.

For eBGP peerings, your router must re-converge to a good state in
under 9 seconds to see an order of magnitude improvement in time-to-repair.
This is typically not the case for transit/customer sessions.

To make a risk/reward choice that is actually based in reality, you
need to understand your total time to re-converge to a good state, and
how much of that is BGP hold-time.  You should then consider whether
changing BGP timers (with its own set of disadvantages) is more or
less practical than using BFD.
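
The arithmetic is simple enough to sketch.  This assumes worst-case
failure detection equal to the BGP hold time (90 seconds) versus a BFD
detection time of 0.9 seconds (say a 300 ms interval with a multiplier
of 3, which is an assumption, not a recommendation), and treats
convergence as a fixed number of seconds.  The function names are
illustrative.

```python
def time_to_repair(detection_s, convergence_s):
    """Total outage: time to notice the failure plus time to converge."""
    return detection_s + convergence_s


def improvement(hold_time_s, bfd_detection_s, convergence_s):
    """How many times faster repair is with BFD than with the hold timer."""
    without_bfd = time_to_repair(hold_time_s, convergence_s)
    with_bfd = time_to_repair(bfd_detection_s, convergence_s)
    return without_bfd / with_bfd


HOLD_TIME = 90.0   # seconds, BGP hold timer
BFD_DETECT = 0.9   # seconds, assumed 300 ms x 3

for convergence in (5, 9, 30, 60):
    ratio = improvement(HOLD_TIME, BFD_DETECT, convergence)
    print(f"convergence {convergence:>2}s -> {ratio:.1f}x faster repair")
```

With these numbers the ratio only reaches 10x when convergence takes
about 9 seconds or less; at 60 seconds of convergence, BFD buys well
under a 3x improvement.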

Let's put it another way: if CPU/FIB convergence time were not a
significant issue, do you think vendors would be working to optimize
this process, that we would have concepts like MPLS FRR and PIC, and
that each new router product line upgrade comes with a yet-faster CPU?
Of course not.  Vendors would just have said, "hey, let's get
together on a lower hold time for BGP."

As I stated, I'll change my opinion of BFD when implementations
improve.  I understand the risk/reward situation.  You don't seem to
get this, and as a result, your overly-simplistic view is that BGP
takes seconds and BFD takes milliseconds.

 For a provider to require a vendor instead of RFC compliance is sinful.

Many sins are more practical than the alternatives.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Why does abuse handling take so long ?

2011-03-13 Thread Jeff Wheeler

On Sun, Mar 13, 2011 at 7:45 AM, Alexander Maassen
outsi...@scarynet.org wrote:
 In most cases the only thing the abuse@ contacts do as hoster, is relay
 the mail to the client but do not dare to do anything themself, even if

The RIPE IRR database contains a systemic means for operators,
responsible for IP address blocks, to exchange PGP-signed messages
amongst each-other in relation to security incidents.  It
unfortunately does not see much use: under 1% of allocations in RIPE's
database include any reference to one of only 235 incident response
teams, which are conceptually similar to a POC.

Other things have been tried but also haven't reached critical mass,
such as dial-by-ASN VOIP connectivity.

The real problem with handling serious network abuse is it's pretty
hard to get through the bozo filter and actually reach anyone who
might understand your request or complaint (DDoS), let alone have the
power to act.  The anti-spam folks have honestly made this problem
far, far worse, by slamming every role mailbox they can find for every
network operator, regardless of whether or not a specific mailbox for
email-related abuse exists or how good (or bad) a network may be at
keeping spam off its network.  I hope this remark doesn't steer the
thread far off-topic, but I wish the anti-spam folks would realize how
counter-productive it is to intentionally send the same complaints to
a multitude of different abuse mailboxes.

For this reason, it really is necessary to have an automatic filtering
mechanism in place just to make sure the network abuse people don't
have to sift through messages which are mostly related to email abuse.

If operators would decide to use a system like IRT, supported in RIPE
IRR, then we would not only be able to filter out a lot of the B.S.,
we would also know that signed messages complaining of DDoS coming in
were actually from the security folks at the complaining organization,
people who have authority to make requests on behalf of the org that
owns related netblocks.

This pretty much eliminates the "why should I believe your evidence?"
argument, because we shouldn't have to believe anyone's evidence to at
least block traffic towards the netblocks they operate.

For example: if I am an end-user with address 192.0.2.80 and my web
site is being subject to DDoS which I believe is originating from
203.0.113.66, I would contact my ISP, who registers themselves as the
IRT for 192.0.2.0/24.  My ISP would probably do a sanity check on my
claim, examine their netflow, etc. and then agree that 203.0.113.66 is
a source of the DDoS.  They'd see that an IRT is registered for
203.0.113.0/24 and send over a PGP-signed message to the counter-party
IRT.  That IRT would verify the PGP signature and association with the
target of the DoS, 192.0.2.80, and at that point, they would have
absolutely zero excuse for not immediately dropping all traffic from
203.0.113.66 towards me at 192.0.2.80.

It doesn't matter if there are any logs or evidence, it matters that
the proven security/abuse contact for 192.0.2.0/24 requested that the
counter-party stop sending traffic to 192.0.2.0/24.  Whether or not
the ISP for 203.0.113.66 decides to investigate any further is up to
them; maybe they log some traffic, find a compromised host, and shut
it down.  Maybe they really don't care.

Now that you know people are capable of doing all that based on data
in RIPE's trusted IRR database, you may also realize that this process
could be streamlined to any point between "human reads email, checks
relationships, and configures network" all the way to "script reads
email, checks relationships, and configures network."  Implementing
this could save NOCs time (if they really cared about outgoing DDoS
from their networks) and improve response to network abuse.
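
The "script" end of that range might look something like the sketch
below.  Everything in it is hypothetical: the registry is a plain
dict standing in for IRT data, PGP verification is reduced to a
boolean the caller supplies, and the returned rule is a generic
pseudo-ACL string rather than any vendor's syntax.

```python
import ipaddress

# Hypothetical stand-in for IRT data: netblock -> responsible IRT.
IRT_REGISTRY = {
    "192.0.2.0/24": "IRT-VICTIM-ISP",
    "203.0.113.0/24": "IRT-SOURCE-ISP",
}


def irt_for(address):
    """Return the IRT handle registered for an address, or None."""
    ip = ipaddress.ip_address(address)
    for block, irt in IRT_REGISTRY.items():
        if ip in ipaddress.ip_network(block):
            return irt
    return None


def handle_request(signer_irt, signature_ok, source, target, local_irt):
    """Return a drop rule if the request is authorized, else None.

    signature_ok is assumed to come from a real PGP check done
    elsewhere; this function only checks the relationships.
    """
    if not signature_ok:
        return None
    if irt_for(target) != signer_irt:
        return None  # requester does not speak for the target
    if irt_for(source) != local_irt:
        return None  # the source is not in our netblocks
    return f"deny ip host {source} host {target}"


print(handle_request("IRT-VICTIM-ISP", True,
                     "203.0.113.66", "192.0.2.80", "IRT-SOURCE-ISP"))
```

Note that no evidence is examined anywhere: the only checks are that
the signer is the registered contact for the target and that the
source is ours, which is the point of the example above.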

So ultimately, there is already a good framework in place to
substantially fix this problem.  No one uses it.  That is unlikely
to change until there is an economic incentive, such as a lawsuit by
someone targeted by DoS which can be proven to be originated from a
negligent network, causing calculable damages.  Until some network has
to pay out a million bucks because they sat on their hands, I don't
see anything changing.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: estimation of number of DFZ IPv4 routes at peak in the future

2011-03-13 Thread Jeff Wheeler

On Sun, Mar 13, 2011 at 1:27 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 there's probably a different need in TOR and BO/SOHO locations than
 core devices, eh?

In today's backbone, this is certainly true.  Feature-driven upgrades
shouldn't be much of a factor for P boxes today, because modern
networks have the option of simply label-switching in the core (just
like 1990s networks could ATM/Frame-switch) without doing much of
anything else.  Feature-driven upgrades should be largely confined to
PE boxes.

For the same reason, upgrading a P box should be easy, not hard.
After all, it's just label-switching.  In today's backbones, it should
be more practical than ever to buy the most cost-effective box needed
for now and the predictable near-term.  Cost per gigabit continues to
fall.  Buying dramatically more capacity than is planned to be
necessary sinks capital dollars into a box that does nothing but
depreciate.

I realize that organizationally-painful budgeting and purchasing
processes often drive networks to buy the biggest thing available.
Vendors understand this, too: they love to sell you a much bigger box
than you need just because upgrading is hard to get approved so you
don't want to do it any more frequently than necessary, even when that
behavior is detrimental to cash-flow and bottom line.  The more broken
your organization, the more you need to spend extra money on too-big
boxes.  Sounds pretty self-defeating, doesn't it?

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: estimation of number of DFZ IPv4 routes at peak in the future

2011-03-13 Thread Jeff Wheeler

On Sun, Mar 13, 2011 at 3:42 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 not everyone drinks the mpls koolaide... so it's not always 'just a
 label switch' and depending upon how large your PE mesh is, there are

If it isn't just a label switch, then features can (and sometimes do)
drive upgrades (therefore costs.)

 not need that info, but the edge likely does, yes? Have 100g customers
 today? planning on having them in the next ~8/12/18 months?

If you did your purchasing the way Bill Herrin suggests, you'd buy a
box with 100GbE ports for a POP or branch that is not projected to
have 100GbE customers, just because it's the biggest box.  His
position is that man-power to do an upgrade is always more costly than
capital dollars for the actual equipment, and ignores the fact that
the biggest box is by no means guaranteed to offer new *features*
which may be required.

I think most of your post is responding to a mis-read of my post, so
I'll skip back to the FIB size question at hand:

 sometimes... sometimes it's just business. I suppose the point here is
 that a box doesn't live ~12 months or even 24, it lives longer.
 Planning that horizon today is problematic when a box today (even the
 largest box) tops out just north of 2m routes (v4, I forget the mix
 v4/6 numbers). your network design may permit you to side step that
 issue in places, but planning for that number is painful today.

I'm not comfortable making the generalization that buying the box with
the largest available FIB is always the most cost-effective choice.
In some box roles, traffic growth drives upgrades, and increased FIB
size in future boxes will be one advantage of a future upgrade that
also increases port speed or density.  In other box roles, features
drive upgrades, and again, FIB size may increase in future boxes which
will be bought anyway to gain desired features.

It's foolish and overly-simplistic to assume that every box upgrade
will be driven by an eventual exhaustion of FIB capacity.

Currently, FIB capacity is being driven by the needs of service
providers' VPN PE boxes.  This is great for networks that do not have
that need, because it is driving FIB capacity up (or cost down) and
further reducing the chance that FIB exhaustion will trigger an
upgrade before other factors, such as port speed/density/features.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts

Re: Why does abuse handling take so long ?

2011-03-13 Thread Jeff Wheeler

On Sun, Mar 13, 2011 at 5:33 PM, Florian Weimer f...@deneb.enyo.de wrote:
 Not that the IRTs are often not the party you want to talk to anyway.

This is why my post highlights the underlying mechanism/system.  It
can and should be used to streamline DDoS mitigation.  It is
unfortunately not in practical use, since the cost of ignoring DoS
originating from one's network is generally low or zero.

-- 
Jeff S Wheeler j...@inconcepts.biz
Sr Network Operator  /  Innovative Network Concepts
