Re: [rrg] [Int-area] Please respond: Questions from the IESG as to whether a WG forming BOF is necessary for LISP

Dino Farinacci Sat, 24 Jan 2009 11:51:47 -0800

  LISP has too many fundamental problems to be considered a
  potentially practical solution.  Some of these problems require

LISP is not perfect and is still evolving, but let me point out that every design has fundamental problems. At this scale, it's very hard to be perfect, Robin. And as Noel has stated so many times, the hard decisions must be made to incrementally deploy this. So things might not look pretty on the surface, but jeez, you got to do what you got to do. Else, this will all be academic.

We really want to solve this problem, sincerely. This is not lip service, we are not just writing papers, we want to rev the Internet, this is serious.

I want to address each of your 7 points below. I have cc'ed the [email protected] mailing list so the folks who want to focus on details of LISP can see this thread.

1 - Delays in delivering initial packets in a flow.  This is either
   due to sending the packets along the ALT network (which takes
   time and involves sending substantial volumes of data packets
   over the ALT network, rather than just mapping requests) or to
   sending mapping requests only, and waiting for the ITR to get a
   response before it attempts to send traffic packets.

We have a memory/bandwidth tradeoff. So we have to make a hard design call. I'd rather have mappings cached and timed out, so they can be updated when they need to be, than hold all the possible mappings for the Internet in an ITR.

There is no such thing as a free lunch. You either store all possible mappings or you fetch them when you need them.
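To make the tradeoff concrete, here is a minimal sketch of a cache whose entries time out and are re-fetched on demand. It is an illustration only, not LISP code: the class name MapCache, the resolver callable (standing in for a Map-Request), and the TTL value are all assumptions made for this example.

```python
import time


class MapCache:
    """Toy EID-prefix -> RLOC cache with timed-out entries (hypothetical)."""

    def __init__(self, resolver, ttl_seconds, clock=time.monotonic):
        self.resolver = resolver    # callable standing in for a Map-Request
        self.ttl = ttl_seconds      # assumed lifetime of a cached mapping
        self.clock = clock          # injectable so the cache can be tested
        self.entries = {}           # eid_prefix -> (rlocs, expiry_time)
        self.requests_sent = 0      # bandwidth cost: one request per miss

    def lookup(self, eid_prefix):
        now = self.clock()
        entry = self.entries.get(eid_prefix)
        if entry is not None and entry[1] > now:
            return entry[0]         # hit: paid for with memory only
        # Miss or expired entry: pay one request and refresh the mapping.
        self.requests_sent += 1
        rlocs = self.resolver(eid_prefix)
        self.entries[eid_prefix] = (rlocs, now + self.ttl)
        return rlocs
```

Memory here grows only with the prefixes a site actually talks to, and the price is one request per miss or expiry. Holding every mapping would remove the requests but make the table as large as the whole mapping database.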

   According to: http://www.lisp4.net/docs/lisp-ausnog02.ppt pp 10
   & 11 the experimental LISP ALT network's ITRs drop initial
   packets and send brief map request messages.   Even if we think
   this delay doesn't matter, I am sure enough potential adopters
   would - and therefore make it difficult or impossible for LISP or
   any other such solution to be widely enough adopted to solve the
   routing scaling problem.

Then why was/is ARP deployed in a 10^4 host-based bridged network with basically the same properties? First-packet loss is not persistent when you look at all the traffic that originates from a source site to another destination site.

The host vendors are probably thinking, if we do LISP in the host, you could wait to send your TCP SYN until the mapping is available. Guess what, you don't send that TCP SYN before you get the DNS Reply. ;-) I wonder why? Well, I'll spell it out. If a host needs an address to make the connection, we can say it also needs a locator to make a connection.

The design choice of LISP *is to just change software in CPE devices* to get the feature of decoupling rather than changing all the hosts at a site. That's a huge deployment advantage. So these CPE devices get packets before they have mappings. We don't even want to consider a host to CPE router signaling protocol and all its complexities to solve this, and to keep the architecture pure, we don't want to snoop on DNS, SIP, or any other protocol that can give us a hint on where the host is about to send a packet.

So the cost is *either* first packet loss or sending the packet on the ALT, using it as a request for a mapping.
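The two choices can be sketched as a per-packet decision in a toy ITR. This is a simplification under stated assumptions: the names MissPolicy and handle_packet are invented for this example, the cache is a plain dict keyed by destination address rather than by EID-prefix, and the returned strings are labels, not protocol messages.

```python
from enum import Enum


class MissPolicy(Enum):
    DROP_AND_REQUEST = 1   # drop the data packet, send a Map-Request
    FORWARD_ON_ALT = 2     # send the data packet over the ALT as the request


def handle_packet(dest_eid, map_cache, policy):
    """Return the action a toy ITR takes for one packet (hypothetical)."""
    rloc = map_cache.get(dest_eid)
    if rloc is not None:
        # Mapping already cached: the common case once a flow is running.
        return ("encapsulate", rloc)
    if policy is MissPolicy.DROP_AND_REQUEST:
        return ("drop_and_send_map_request", None)
    return ("forward_on_alt", None)
```

Either way the cost is paid only until the mapping lands in the cache. Every later packet to that destination takes the "encapsulate" branch.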

2 - LISP-ALT's long-path problem
     http://psg.com/lists/rrg/2008/msg01676.html
     http://www.antd.nist.gov/~ksriram/strong_aggregation.png

     [Fix? Another fundamental problem in the architecture.  Could
      be partially solved by more meshiness, but that would greatly
      increase the complexity of the network and so raise more
      scaling problems.]

Well, this I believe is a duplicate of 1 above. So it's not really *another* problem.

3 - Problems creating a highly aggregated ALT network in order to
   speed the flow of packets up and down the hierarchy, while also
   making the network robust against the failure of its routers and
   tunnels.  This has not been discussed much on the RRG, but it is
   an obvious problem.

If we can do this with BGP, for which we have decades of existence proof, why can't we do this with a tunnel topology 1) where a tunnel can be rehomed much more quickly and easily than the physical links and boxes we have to deal with today, allowing aggregation to occur at the edges of the ALT network, and 2) where these tunnels stay up because there is robust connectivity below the tunnel level to keep them up, hence there will be fewer route-flaps for EID-prefixes?

I think it's the complete opposite of what you claim. I think BGP on the ALT will be more stable and scalable than the underlying BGP. Plus, what we propose for the use-case of BGP uses quite a few features of it. Recall this is eBGP over GRE.

We have an infrastructure where we can ping to see liveness of nodes on the ALT. We have traceroute to determine the path a Map-Request takes on the ALT. We can ping/traceroute "underneath" so we can see the diameter of the tunnel. Not only do we have a solution, but we use existing rudimentary tools for management.

And the address allocation hierarchy can map this logical ALT topology. And if the Registries are involved in managing part of this ALT network (which we hope and think they will be), we can keep this consistent relationship.


Do you realize the goodness of this? It's huge Robin.

There's more too, you then throw SIDR in the mix and we have secured the ALT, we have secured mappings, we created a PKI for routing use. In fact, the first SIDR deployment could happen on the ALT to be verified/experimented before it goes on the underlying BGP.

And note that the infrastructure will/does carry exactly 2 address-families. So we can do 6-to-6-over-4 with this approach. That means we can get two sites to be *IPv6-only* and be able to talk to each other.

If you are an IPv6-only site now or a dual-stack site, you could talk to the new IPv6 Google services. I think this is a pretty huge feature.

This is clean and architecturally pure, no double NATs, no CGNs, and no applications breaking.

4 - LISP-ALT's Aggregation implies provider dependence.
   This is Christian Vogt's critique:
   http://psg.com/lists/rrg/2008/msg00259.html

Not true. Aggregation here is for the EID-prefix. Service providers do not carry EID-prefixes in their cores so you don't depend on them. The decoupling of the address creates this. The dependence is now on the ALT. And if your site resides in a specific region of the world, you get your EID-prefixes from that registry. So readdressing your domain would only occur if you moved it from one region to another (let's leave mobile ASes out of this for now).
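A small sketch of why the aggregate does not depend on the provider, using Python's standard ipaddress module. The registry block, the site prefixes, and the RLOC addresses are made-up values for illustration, and covered_by and change_provider are hypothetical helper names.

```python
import ipaddress

# Hypothetical regional EID block handed out by a registry.
registry_block = ipaddress.ip_network("10.0.0.0/8")

# EID-prefix -> RLOC. The RLOCs come from whichever provider the site uses.
site_mappings = {
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.1",      # site A, provider X
    ipaddress.ip_network("10.2.0.0/16"): "198.51.100.1",   # site B, provider Y
}


def covered_by(block, prefixes):
    """True if one announcement of `block` covers every EID-prefix."""
    return all(prefix.subnet_of(block) for prefix in prefixes)


def change_provider(mappings, eid_prefix, new_rloc):
    """Return a copy of the mappings with only the RLOC changed."""
    updated = dict(mappings)
    updated[eid_prefix] = new_rloc
    return updated
```

Calling change_provider for site A leaves its EID-prefix, and therefore the aggregate, untouched. Only the locator side of the mapping moves.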

5 - Path MTU Discovery problems.  Despite Fred Templin, myself and
   others discussing the inherent PMTUD problems in any map-encap
   proposal, there has been nothing from the LISP team to indicate
   they have a solution.  They seem to think there is no problem.

In section 5.4 of draft-farinacci-lisp-11.txt we describe two proposed solutions, one is stateless, and the other is stateful. The stateful one creates no new table data structures but requires storing an additional 2 bytes of effective-MTU state per mapping.
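As a rough picture of what 2 bytes of effective-MTU state per mapping could look like, here is a sketch. It is not taken from the draft text, which remains the authority on the actual procedure. The Mapping class, the 1500-byte starting value, and the 36-byte encapsulation overhead (20 outer IPv4 + 8 UDP + 8 LISP header) are assumptions made for this example.

```python
from dataclasses import dataclass

# Assumed overhead for IPv4 encapsulation: outer IPv4 + UDP + LISP header.
ENCAP_OVERHEAD = 20 + 8 + 8


@dataclass
class Mapping:
    """Toy map-cache entry carrying one extra 16-bit field (hypothetical)."""
    rloc: str
    effective_mtu: int = 1500   # assumed starting value, fits in 2 bytes

    def learn_path_mtu(self, reported_mtu):
        # E.g. a value reported back from the path to this RLOC.
        if not 0 < reported_mtu <= 0xFFFF:
            raise ValueError("MTU must fit in 16 bits")
        # This sketch only ever lowers the stored value.
        self.effective_mtu = min(self.effective_mtu, reported_mtu)

    def fits(self, inner_packet_len):
        # Can the inner packet be encapsulated without exceeding the MTU?
        return inner_packet_len + ENCAP_OVERHEAD <= self.effective_mtu
```

The point is only that the state rides on an entry that already exists, so no new table is needed.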

6 - Lack of business case for LISP's Proxy Tunnel Routers:
   http://psg.com/lists/rrg/2008/msg02021.html

You cannot fault a technical design for a business case. A PTR is solving a technical problem. And if we want to *truly* keep lots of PI type routes out of the core *and* avoid NAT solutions which are just way too high in opex, the PTR is the only solution we have to turn to.

And on the contrary, I do believe service providers, interconnect providers, registries, third-parties and even governments will provide PTR services. Will they make a ton of money out of it? Well, that remains to be seen.

7 - The scaling problems of potentially thousands of ITRs each
   probing reachability to one ETR, and likewise, one ITR probing
   reachability to many ETRs - this is one view of the "Locator Path
   Liveness Problem" of draft-meyer-loc-id-implications-00.
   http://www.irtf.org/pipermail/rrg/2009-January/000809.html


That is not in the LISP design. Everyone just thinks it is.  ;-)

Dave and Darrel's draft is providing a warning about how bad probing can be. They do not take a position on whether it should go into any proposal. They are just saying, beware of the Frankenstein that may result, which can be interpreted as a reason to not do probing at all.

Like I mentioned in a previous RRG email message, one has to ask the question if an ITR *should* switch from one RLOC to another when there *may* be a path failure *somewhere in the middle of the network*. Please note my very fine qualifications.

If we want to solve this problem, we could do this today by having a host switch its TCP connection to another A record. This doesn't happen today because people deal with packet loss, since it doesn't last long *and* rerouting actually works quite well.

Van Jacobson always made this comment and I'll never forget it, "The fact that the Internet drops packets is its greatest feature".

What else can either an xTR or TCP host do when sending ICMP Unreachables is off by default in most modern routers, or they are filtered by firewalls, and route aggregation hides failures close to packet sources?

Unless these concerns are adequately addressed, claiming that LISP
is an appropriate solution to the problems discussed at the IAB's
October 2006 Routing and Addressing Workshop is nothing more than
a proof by an emphatic assertion.


I agree entirely.

I believe the LISP team could have made much better use of the RRG -
by participating fully in the debates resulting from these critiques.

We were asked to do research in RRG. That was a reasonable request. So the research stuff in LISP has been and will continue to be presented in RRG.

As for the engineering issues, the real details and bits and bytes, we want to discuss and work out issues in an open forum. I've been going to IETF for 20 years now, that forum is called a working group.

The working group doesn't have to standardize what it is working on. And the charter and the numerous requests we have made ask *for an experimental working group*.

Experiments won't help solve most of these problems.  I am not
against experimentation and I think it is great that there is a LISP
experimental network.

However, I would never have taken a proposal to the point of writing
code, running a network and inviting others to write compatible
implementations when the proposal had so many fundamental problems.

There is constant implementation feedback back into the design. Experienced engineers know how this cycle works. You start with something you think can hold together, you try things out, you refine, you revisit design, you go forward with implementation. That's the process of *detailed* engineering.


For the old timers, that was the difference between TCP/IP and OSI.

Sorry for the long email,
Dino

_______________________________________________
rrg mailing list
[email protected]
http://www.irtf.org/mailman/listinfo/rrg
