Joel, I agree TCP reliability != application delivery knowledge, but TCP reliability does ensure the message will *eventually* get delivered to the application (unless there's an application bug) without additional application reliability logic. I am sure there are applications that require/expect the ACK to be sent from the application layer, to avoid the time delta between transport reception and application reception seen in TCP, but I don't think LISP needs that level of differentiation.
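To make the "time delta between transport reception and application reception" concrete, here is a minimal sketch over loopback TCP. It is only an illustration under stated assumptions: the function name send_without_reader and the payload are hypothetical, and the payload is a placeholder string, not an encoded LISP Map-Register.

```python
import socket
from typing import Tuple


def send_without_reader(payload: bytes) -> Tuple[int, bytes]:
    """Send payload over loopback TCP before the peer application reads anything.

    Hypothetical helper for illustration only; it is not part of any LISP code.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        # send() returns as soon as the local kernel has accepted the bytes
        # into its socket buffer. At this point the receiving application
        # has not called recv() at all, so a successful return says nothing
        # about application-level delivery.
        sent = sender.send(payload)

        # Only now does the receiving application read. TCP has held the
        # data for it in the meantime.
        receiver.settimeout(5.0)
        received = b""
        while len(received) < sent:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            received += chunk
        return sent, received
    finally:
        sender.close()
        receiver.close()
        listener.close()


if __name__ == "__main__":
    sent, received = send_without_reader(b"map-register")
    print(sent, received)
```

The sender sees success before the receiver has read anything, which is Joel's point. The data is still there when the receiver gets around to reading, which is mine.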
My previous example simply states that with congestion control the ETR would not keep retransmitting if the MS is congested. At scale, retransmission can potentially cause the system to get stuck in this state indefinitely.
-Johnson
> On Dec 6, 2017, at 12:53 PM, Joel M. Halpern <[email protected]> wrote:
>
> Sure, if the TCP queue backs up far enough that an effort to write by the
> application fails, the xTR can stop then. But until that point, it has no
> way of knowing whether the registration got through or not. TCP reliability
> != application delivery knowledge.
>
> There may well be good reasons for wanting to use TCP for this exchange.
> There often are. But the example you gave in this case is arguably a
> counter-example.
>
> Yours,
> Joel
>
> On 12/6/17 3:25 PM, Johnson Leong (joleong) wrote:
>> Joel,
>> If there's flow control between the xTR and MS then the xTR wouldn't attempt
>> to send the register if the MS cannot keep up. Depending on your
>> implementation the xTR can stop building registration messages until flow
>> control is off with the MS.
>> -Johnson
>>> On Dec 6, 2017, at 12:17 PM, Joel M. Halpern <[email protected]> wrote:
>>>
>>> <speaking without hats.>
>>> Johnson, I think your example may cause more problems.
>>> I am not sure what you mean by "over-subscribed".
>>> But if the xTR sends a registration over TCP, and gets no answer, it is
>>> going to assume that it is properly registered. If the registration has not
>>> been received due to the receiver not reading his TCP queue (because he is
>>> over-subscribed) that is a bad result.
>>>
>>> Yours,
>>> Joel
>>>
>>> On 12/6/17 2:44 PM, Johnson Leong (joleong) wrote:
>>>> Dino,
>>>> What we're trying to convey is that by using TCP it allows us to achieve
>>>> reliability which significantly simplifies the code and error handling.
>>>> One can achieve the same with UDP with ack and retry but it would be
>>>> reimplementing what TCP already offers and this added complexity typically
>>>> makes the code error prone.
>>>> Another benefit with TCP is congestion control. With UDP, if we send a
>>>> map-register and we don't get an ack then we retry. If the MS is
>>>> fully subscribed then this can lead to constant retry. You can impose a
>>>> backoff mechanism but ultimately this would affect convergence.
>>>> -Johnson
>>>>> On Dec 6, 2017, at 11:17 AM, Dino Farinacci <[email protected]> wrote:
>>>>>
>>>>>> Dino,
>>>>>>
>>>>>>> LISP (the application), does not know that itself, the xTR, is in sync
>>>>>>> with the map-server. The packets can be in flight or being
>>>>>>> retransmitted due to loss. But if a Map-Register is sent with a nonce
>>>>>>> and no Map-Notify is returned the xTR knows for sure the two are in
>>>>>>> sync.
>>>>>>
>>>>>> An application (and LISP in this case) should always be able to know the
>>>>>> state of a (TCP) socket that it has opened with a server. I’m not
>>>>>> entirely sure why we would not want to use this information.
>>>>>
>>>>> All an app knows is the socket it has opened and any port it has
>>>>> bind()’ed to. It doesn’t know the connection state in terms of what
>>>>> packets have been sent and what has been ack’ed. There IS NOT enough
>>>>> information to do anything useful.
>>>>>
>>>>>> Besides, the reliable transport session does not invalidate the use of
>>>>>> nonces and Map-Notifies as an indication that the MS has completely
>>>>>> received the information, the just rely on the TCP state to know that
>>>>>> nothing has changed.
>>>>>
>>>>> Right, so you have duplicate functionality. That isn’t efficient.
>>>>>
>>>>>>> I’d argue you make it worse. TCP does provide reliability but so does
>>>>>>> LISP itself. And the only reason the messages are periodic is because
>>>>>>> the spec said to send every 1 minute and timeout every 3 minutes.
>>>>>>> You can make it 1000 minutes and timeout every 3000 minutes.
>>>>>>
>>>>>> And this sounds as if we would make the protocol very complicated to use
>>>>>> (or code), as this would lead us to have to code/configure a specific
>>>>>> registration pattern/logic for every use case that we want to support.
>>>>>> Not saying we cannot, but it sounds like re-implementing TCP ;)
>>>>>
>>>>> You already have that code or you aren’t implementing Map-Register acks
>>>>> correctly. And changing a timer is really a constant change in the code.
>>>>> And what is spec’ed today in LISP proper is much simpler than TCP, it is
>>>>> not reimplementing it.
>>>>>
>>>>> But let's not drift from the point. The spec is written in a general way
>>>>> to solve only sending Map-Registers reliably. I suggest the text not
>>>>> mislead the reader to think this is a general new packet format for
>>>>> anything.
>>>>>
>>>>> When people judge which protocols they want to deploy, the people that do
>>>>> the due diligence will look at the protocol specs and see how much
>>>>> mechanism is designed in. And they make judgement decisions about using
>>>>> the protocol.
>>>>>
>>>>> I know of at least 2 vendors that said “I implemented pages 47-50 of the
>>>>> EVPN spec”. ;-)
>>>>>
>>>>> In LISP, we want to make sure what we spec is what is used and not
>>>>> ignored or considered optional. So please document the specific use-case
>>>>> you want to implement. And I’d suggest making the draft name
>>>>> draft-*-lisp-reliable-registers.
>>>>>
>>>>>>> LISP can pack all those EID-records in a Map-Register just like TCP
>>>>>>> does. And if you want per nonce acks, you pack them in IP packets <=
>>>>>>> 65535 bytes. TCP will have to do that as well.
>>>>>>
>>>>>> This is exactly the point, while LISP signaling allows it we don’t need
>>>>>> to re-implement every TCP feature in LISP, as TCP can already provide it.
>>>>>
>>>>> You don’t need to.
>>>>>
>>>>>>> And guess what.
>>>>>>> What if there is an RLOC-change and you already gave the last one to
>>>>>>> TCP and can’t pull it back? If you were waiting for an ack and a new
>>>>>>> RLOC-change came in (during a lossy case), you wouldn’t have to
>>>>>>> retransmit the old information wastefully. So keeping the
>>>>>>> “retransmission queue” in LISP has its advantages.
>>>>>>
>>>>>> I’m not sure this is so easy. UDP, just like TCP, uses the RLOC (IP) as
>>>>>> part of the “session identifier” and the nonce is per-packet, not
>>>>>> per-session. The moment the RLOC changes on the xTR, the MS does not
>>>>>> know that the xTR is the same so we’d need a retransmission process.
>>>>>
>>>>> I’m not talking about when the xTR changes, I’m talking about when an
>>>>> address on an interface on the same xTR changes since it was DHCP’ed to
>>>>> the xTR or behind a NAT. But for the LISP-MN case, the xTR is moving and
>>>>> its RLOCs are changing. You couple this with pubsub and the extra
>>>>> Map-Registers with old RLOCs have a ripple effect where the
>>>>> Map-Server sends Map-Notify messages to all the subscribers,
>>>>> followed by another set of Map-Notifies with the new RLOC-set. That is a
>>>>> lot of (unnecessary) messaging and processing.
>>>>>
>>>>> We have to think about the implications of any one draft on the ENTIRE
>>>>> LISP architecture. It must work efficiently as one holistic distributed
>>>>> system.
>>>>>
>>>>> Dino
>>>>>
>>>>>>
>>>>>> Marc
>>>>>>
>>>>>> On 12/5/17, 5:57 PM, "Dino Farinacci" <[email protected]> wrote:
>>>>>>
>>>>>>> Dino,
>>>>>>>
>>>>>>> In addition to the previous arguments there are particular use-cases
>>>>>>> where the use of reliable transport simplified the deployment of LISP.
>>>>>>
>>>>>> I understand its advantages. I am examining its costs.
>>>>>>
>>>>>>> As an example, the moment we started scaling datacenters to support 10s
>>>>>>> of thousands of hosts, the use of a reliable transport helped a lot with
>>>>>>> the management of scale:
>>>>>>> On one side it reduces the amount of signaling when nothing changes,
>>>>>>> since we use TCP state as an indication that xTRs and the MS are in
>>>>>>> sync and there is no need to deal with the optimization of the refresh
>>>>>>> logic (periodic or paced).
>>>>>>
>>>>>> LISP (the application), does not know that itself, the xTR, is in sync
>>>>>> with the map-server. The packets can be in flight or being retransmitted
>>>>>> due to loss. But if a Map-Register is sent with a nonce and no
>>>>>> Map-Notify is returned the xTR knows for sure the two are in sync.
>>>>>>
>>>>>> I’d argue you make it worse. TCP does provide reliability but so does
>>>>>> LISP itself. And the only reason the messages are periodic is because
>>>>>> the spec said to send every 1 minute and timeout every 3 minutes. You
>>>>>> can make it 1000 minutes and timeout every 3000 minutes.
>>>>>>
>>>>>> So let’s keep periodic overhead, reliability, and staying in sync as
>>>>>> separate issues.
>>>>>>
>>>>>>> On the other side, with reliable transport we offload the reliable
>>>>>>> delivery of information (and congestion control)
>>>>>>
>>>>>> I understand that. But you can’t say TCP is keeping you in sync,
>>>>>> because you have removed detail from the application.
>>>>>>
>>>>>>> from LISP to another process (TCP) that is entirely devoted and
>>>>>>> designed for this. For example, supporting events like mass VM moves
>>>>>>> relying purely on LISP based ACKs became very challenging, as we ended
>>>>>>> up having to deal with congestion events related to the signaling load
>>>>>>> generated. The use of the reliable transport largely simplified the
>>>>>>> problem.
>>>>>>
>>>>>> Dino
>>>>>>
>>>>>>>
>>>>>>> Marc
>>>>>>>
>>>>>>> On 12/5/17, 12:06 PM, "lisp on behalf of Johnson Leong (joleong)"
>>>>>>> <[email protected] on behalf of [email protected]> wrote:
>>>>>>>
>>>>>>> Hi Dino,
>>>>>>>
>>>>>>> A large portion of this draft discusses the state machine required for
>>>>>>> TCP and how to ensure the MS and xTR are in sync. We literally reuse
>>>>>>> the entire UDP map-register code, we just wrap that message around the
>>>>>>> LISP TCP header so there's a lot of code reuse. Finally, this draft is
>>>>>>> not meant to replace UDP register but in some of our use cases TCP
>>>>>>> would scale better to avoid the periodic registration.
>>>>>>>
>>>>>>> -Johnson
>>>>>>>
>>>>>>>> On Dec 5, 2017, at 10:52 AM, Dino Farinacci <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> registration protocol, that might be orthogonal to other
>>>>>>>>> transport-related mechanisms. In my experience this has proved to be
>>>>>>>>> very effective in scalability of large LISP deployments, especially
>>>>>>>>> with the increased volume of registration data.
>>>>>>>>
>>>>>>>> I agree it’s a point solution for registration. Then why did you need
>>>>>>>> to have a general format?
>>>>>>>>
>>>>>>>> I could support this draft if it was simplified to spec how to use
>>>>>>>> Map-Registers in TCP and nothing more.
>>>>>>>>
>>>>>>>> The only thing I would add is how to use TLS so encryption is
>>>>>>>> supported. More and more requirements are coming up for protecting the
>>>>>>>> privacy of location information. And since Map-Registers carry RLOCs
>>>>>>>> (and potential Geo-Coordinates) that information needs to be protected.
>>>>>>>>
>>>>>>>> Dino
>>>>>>>> _______________________________________________
>>>>>>>> lisp mailing list
>>>>>>>> [email protected]
>>>>>>>> https://www.ietf.org/mailman/listinfo/lisp
