Re: [rrg] [Int-area] Please respond: Questions from the IESG as to whether a ...

HeinerHummel Sun, 25 Jan 2009 09:21:19 -0800

 
Dino,
Just a simple question: Is the ALT GRE-tunnel hierarchy a tree ?
 
Heiner
 
 
In an email dated 24.01.2009 20:51:53 Western European Standard Time,
[email protected] writes:


>   LISP has too many fundamental problems to be  considered a
>   potentially practical solution.  Some of  these problems require

LISP is not perfect and is still evolving, but let me point out that every
design has fundamental problems. At this scale, it's very hard to be
perfect, Robin. And as Noel has stated so many times, the hard
decisions must be made to incrementally deploy this. So things might
not look pretty on the surface, but jeez, you got to do what you got
to do. Else, this will all be academic.

We really want to solve this problem, sincerely. This is not  lip  
service, we are not just writing papers, we want to rev the  Internet,  
this is serious.

I want to address each of your 7  points below. I have cc'ed the 
[email protected] 
mailing list so the  folks who want to focus on details of LISP can  
see this  thread.

> 1 - Delays in delivering initial packets in a flow.   This is either
>    due to sending the packets along the ALT  network (which takes
>    time and involves sending  substantial volumes of data packets
>    over the ALT network,  rather than just mapping requests) or to
>    sending mapping  requests only, and waiting for the ITR to get a
>    response  before it attempts to send traffic packets.

We have a memory/bandwidth tradeoff. So we have to make a hard design
call. I'd rather have mappings cached and timed out so they can be
updated when they need to than to hold all the possible mappings for
the Internet in an ITR.

There is no such thing as a free lunch. You either store all possible
mappings or you fetch them when you need them.
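
To make the tradeoff concrete, here is a minimal sketch of the "cached and
timed out" side of it. It is not taken from the LISP drafts: the class
name MapCache, the ttl_seconds parameter, and the exact-match lookup on a
prefix string (a real ITR would do longest-prefix match) are all
assumptions made for illustration.

```python
import time


class MapCache:
    """Toy EID-prefix -> RLOC cache with per-entry timeouts (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable clock, so expiry can be tested
        self._entries = {}           # eid_prefix -> (rlocs, expiry_time)

    def insert(self, eid_prefix, rlocs, ttl_seconds):
        """Store a mapping that stays valid for ttl_seconds."""
        self._entries[eid_prefix] = (list(rlocs), self._clock() + ttl_seconds)

    def lookup(self, eid_prefix):
        """Return the cached RLOCs, or None on a miss or an expired entry."""
        entry = self._entries.get(eid_prefix)
        if entry is None:
            return None              # never fetched: the ITR has to ask
        rlocs, expiry = entry
        if self._clock() >= expiry:
            del self._entries[eid_prefix]   # timed out: must be fetched again
            return None
        return rlocs

    def __len__(self):
        return len(self._entries)
```

Memory use here grows with the number of destinations a site actually
talks to, not with the number of mappings in the Internet; the price is
that a miss has to be resolved over the network.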

>    According to:  http://www.lisp4.net/docs/lisp-ausnog02.ppt pp 10
>    &  11 the experimental LISP ALT network's ITRs drop initial
>     packets and send brief map request messages.   Even if we  think
>    this delay doesn't matter, I am sure enough  potential adopters
>    would - and therefore make it  difficult or impossible for LISP or
>    any other such  solution to be widely enough adopted to solve the
>    routing  scaling problem.

Then why was/is ARP deployed in a 10^4 host-based bridged network with
basically the same properties? First packet loss is not persistent
when you look at all the traffic that originates from a source site to
another destination site.

The host vendors are probably thinking, if we do LISP in the host, you
could wait to send your TCP SYN until the mapping is available. Guess
what, you don't send that TCP SYN before you get the DNS Reply. ;-) I
wonder why? Well, I'll spell it out. If a host needs an address to make
the connection, we can say it also needs a locator to make a connection.

The design choice of LISP *is to just change software in CPE devices*
to get the feature of decoupling rather than changing all the hosts at
a site. That's a huge deployment advantage. So these CPE devices get
packets before they have mappings. We don't even want to consider a
host to CPE router signaling protocol and all its complexities to
solve this. And to keep the architecture pure, we don't want to snoop
on DNS, SIP, or any other protocol that can give us a hint on where
the host is about to send a packet.

So the cost is *either* first packet loss or  sending the packet on the  
ALT using it as a request for a  mapping.
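
The two costs named above can be sketched as two miss policies in an ITR.
This is only an illustration under stated assumptions: the names
MissPolicy and handle_packet, the action labels, and the use of a plain
dict as the map-cache are hypothetical and do not come from the LISP
drafts.

```python
from enum import Enum


class MissPolicy(Enum):
    DROP_AND_REQUEST = "drop"      # drop the data packet, send a Map-Request
    FORWARD_OVER_ALT = "forward"   # send the data packet itself over the ALT


def handle_packet(dest_eid, map_cache, policy):
    """Return a list of (action, detail) tuples describing what the ITR does.

    map_cache is a plain dict from destination EID to RLOC (an assumption
    made to keep the sketch short).
    """
    rloc = map_cache.get(dest_eid)
    if rloc is not None:
        # Mapping already cached: encapsulate towards the RLOC, no extra cost.
        return [("encapsulate", rloc)]
    if policy is MissPolicy.DROP_AND_REQUEST:
        # Cost 1: the first packet is lost while the mapping is fetched.
        return [("drop", dest_eid), ("map_request", dest_eid)]
    # Cost 2: the data packet rides the ALT and doubles as the request.
    return [("forward_on_alt", dest_eid)]
```

Either way, only packets that arrive before the mapping is cached pay the
cost; later packets to the same destination take the first branch.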

> 2 - LISP-ALT's long-path problem
>     http://psg.com/lists/rrg/2008/msg01676.html
>       http://www.antd.nist.gov/~ksriram/strong_aggregation.png
>
>   [Fix? Another fundamental problem in the architecture.   Could
>       be partially solved by more meshiness,  but that would greatly
>       increase the  complexity of the network and so raise more
>        scaling problems.]

Well this I believe is a duplicate of 1 above. So  it's not really  
*another* problem.

> 3 - Problems creating  a highly aggregated ALT network in order to
>    speed the  flow of packets up and down the hierarchy, while also
>     making the network robust against the failure of its routers and
>   tunnels.  This has not been discussed much on the RRG, but it  is
>    an obvious problem.

If we can do this with BGP, for which we have decades of existence proof,
why can't we do this with a tunnel topology where 1) a tunnel can be
rehomed much more quickly and easily than the physical links and boxes we
have to deal with today, allowing aggregation to occur at the edges of the
ALT network, and 2) these tunnels stay up because there is robust
connectivity below the tunnel level to keep them up, hence there will
be fewer route-flaps for EID-prefixes?

I think it's the complete opposite of what you claim. I think BGP on the
ALT will be more stable and scalable than the underlying BGP. Plus, what
we propose for the use-case of BGP uses quite a few features of it.
Recall this is eBGP over GRE.

We have an infrastructure where we can ping to see liveness of
nodes on the ALT. We have traceroute to determine the path a
Map-Request takes on the ALT. We can ping/traceroute "underneath" so we
can see the diameter of the tunnel. Not only do we have a solution, but
we use existing rudimentary tools for management.

And the  address allocation hierarchy can map this logical ALT  
topology. And  if the Registries are involved in managing part of this  
ALT network  (which we hope and think they will), we can keep this  
consistent  relationship.

Do you realize the goodness of this? It's huge  Robin.

There's more too, you then throw SIDR in the mix and we have  secured  
the ALT, we have secured mappings, we created a PKI for  routing use.  
In fact, the first SIDR deployment could happen on the  ALT to be  
verified/experimented before it goes on the underlying  BGP.

And note that the infrastructure will/does carry exactly 2
address-families. So we can do 6-to-6-over-4 with this approach. That
means we can get two sites to be *IPv6-only* and be able to talk to
each other.

If you are an IPv6-only site now or a dual-stack site, you could talk
to the new IPv6 Google services. I think this is a pretty huge feature.

This is clean and architecturally pure, no double NATs, no CGNs, and
no applications breaking.

> 4 - LISP-ALT's  Aggregation implies provider dependence.
>    This is  Christian Vogt's critique:
>     http://psg.com/lists/rrg/2008/msg00259.html

Not true. Aggregation here  is for the EID-prefix. Service providers do  
not carry EID-prefixes  in their cores so you don't depend on them. The  
decoupling of the  address creates this. The dependence is now on the  
ALT. And if your  site resides in a specific region of the world, you  
get your  EID-prefixes from that registry. So readdressing your domain  
would  only occur if you moved it from one region to another (let's  
leave  mobile ASes out of this for now).

> 5 - Path MTU Discovery  problems.  Despite Fred Templin, myself and
>    others  discussing the inherent PMTUD problems in any map-encap
>     proposal, there has been nothing from the LISP team to indicate
>   they have a solution.  They seem to think there is no  problem.

In section 5.4 of draft-farinacci-lisp-11.txt we describe two proposed
solutions, one is stateless, and the other is stateful. The stateful
one creates no new table data structures but requires storing an
additional 2 bytes of effective-MTU state per mapping.
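
As a rough picture of what "2 bytes of effective-MTU state per mapping"
could look like, here is a minimal sketch. Only the existence of a 16-bit
value per mapping comes from the text above; how the value is learned and
what the ITR does with an oversized packet are assumptions here, not
quoted from section 5.4 of the draft. The names Mapping, record_path_mtu,
fits, and the encap_overhead parameter are hypothetical.

```python
from dataclasses import dataclass

MAX_U16 = 0xFFFF  # largest value that fits in 2 bytes


@dataclass
class Mapping:
    eid_prefix: str
    rloc: str
    # The extra 2 bytes of state; "nothing learned yet" is modelled as the
    # maximum 16-bit value (an assumption of this sketch).
    effective_mtu: int = MAX_U16

    def record_path_mtu(self, reported_mtu):
        """Lower the stored value when a smaller path MTU is learned."""
        if not 0 < reported_mtu <= MAX_U16:
            raise ValueError("MTU must fit in 16 bits")
        self.effective_mtu = min(self.effective_mtu, reported_mtu)

    def fits(self, inner_len, encap_overhead):
        """True if the packet plus encapsulation headers fits the stored MTU."""
        return inner_len + encap_overhead <= self.effective_mtu
```

The state lives inside the existing mapping entry, so no separate table is
needed; the check before encapsulation is a single comparison.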

>  6 - Lack of business case for LISP's Proxy Tunnel Routers:
>   http://psg.com/lists/rrg/2008/msg02021.html

You cannot fault a  technical design for a business case. A PTR is  
solving a technical  problem. And if we want to *truly* keep lots of PI  
type routes out  of the core *and* avoid NAT solutions which are just  
way too high in  opex, the PTR is the only solution we have to turn to.

And on the contrary, I do believe service providers, interconnect
providers, registries, third-parties and even governments will provide
PTR services. Will they make a ton of money out of it? Well, that
remains to be seen.

> 7 - The scaling problems of potentially thousands of  ITRs each
>    probing reachability to one ETR, and likewise,  one ITR probing
>    reachability to many ETRs - this is one  view of the "Locator Path
>    Liveness Problem" of  draft-meyer-loc-id-implications-00.
>     http://www.irtf.org/pipermail/rrg/2009-January/000809.html

That is not  in the LISP design. Everyone just thinks it is.  ;-)

Dave and Darrel's draft is providing a warning about how bad probing
can be. They do not take a position on whether it should go into any
proposal. They are just saying, beware of the Frankenstein that may
result, which can be read as advice to not do probing at all.

Like I mentioned in  a previous RRG email message, one has to ask the  
question if an ITR  *should* switch from one RLOC to another when there  
*may* be a path  failure *somewhere in the middle of the network*.  
Please note my  very fine qualifications.

If we want to solve this problem, we could do this today by having a
host switch its TCP connection to another A record. This doesn't
happen today because people deal with packet loss, since it doesn't
last long *and* rerouting actually works quite well.

Van Jacobson always made this comment and I'll never forget it, "The
fact that the Internet drops packets is its greatest feature".

What else can either an xTR or a TCP host do when sending ICMP
Unreachables is off by default in most modern routers, or they are
filtered by firewalls, and route aggregation hides failures close to
packet sources?

>> Unless these concerns are  adequately addressed, claiming that LISP
>> is an appropriate  solution to the problems discussed at the IAB's
>> October 2006  Routing and Addressing Workshop is nothing more than
>> a proof by an  emphatic assertion.
>
> I agree entirely.
>
> I  believe the LISP team could have made much better use of the RRG -
> by  participating fully in the debates resulting from these critiques.

We  were asked to do research in RRG. That was a reasonable request. So   
the research stuff in LISP has been and will continue to be  presented  
in RRG.

As for the engineering issues, the real details and bits and bytes, we
want an open forum to discuss and work out issues. I've been going to
IETF for 20 years now, that forum is called a working group.

The working group doesn't have to standardize what it is working on.
And the charter and the numerous requests we have made ask *for
an experimental working group*.

> Experiments won't help solve most of these  problems.  I am not
> against experimentation and I think it is  great that there is a LISP
> experimental network.
>
>  However, I would never have taken a proposal to the point of writing
>  code, running a network and inviting others to write compatible
>  implementations when the proposal had so many fundamental  problems.

There is constant implementation feedback back into the  design.  
Experienced engineers know how this cycle works. You start  with  
something you think can hold together, you try things out, you  refine,  
you revisit design, you go forward with implementation.  That's the  
process of *detailed* engineering.

For the old  timers, that was the difference between TCP/IP and OSI.

Sorry for the  long  email,
Dino

_______________________________________________
rrg  mailing  list
[email protected]
http://www.irtf.org/mailman/listinfo/rrg
