draft-quinn-vxlan-gpe-03

Tom Herbert Fri, 01 Aug 2014 10:12:24 -0700

On Fri, Aug 1, 2014 at 9:24 AM, Dino Farinacci <farina...@gmail.com> wrote:
>> Hello Dino,
>>
>> always interesting, different people interpret differently.
>>
>> You seem to talk about controlling where the OAM data flows. I don't think
>> this is what Eric means. As long as a flow is defined in the traditional 3-
>> or 5-tuple way from the IP/UDP packet and as long as every router is mapping
>> the same flow onto the same ECMP egress interface we are fine. The details -
>> the exact algorithm, if the algorithm uses the router-id as an additional
>> hash or whatever else - don't matter. The OAM packet would use the same
>> IP/UDP header as the data packet it's supervising. And the receiver processes
>> OAM packets as "OAM" because of the flag set.
>
> For overlays, if you want to do RLOC selection at the encapsulator based on 
> the underlay's paths, you will need to know all paths so you can have 
> granular decision control.
>
> I didn't mean to emphasize the algorithm being used at each hop. There is a 
> combinatorial problem by looking at all paths. For example say one hop has 
> ECMP paths (A, B, C) and the next one has (A', B', C'). For different 
> 5-tuples you can have A -> A', A -> B', A -> C', B -> A', B -> B', B -> C',
> and C -> A', C -> B', C -> C'. And if the same algorithms are not used, you
> may have a 1-to-many relationship from hop 1 to hop 2.
>
> If an OAM mechanism is going to measure attributes of a path, it will have to
> do so for all paths. And the number of combinations above was just in 2 boxes
> with only 3 ECMPs each.
>
> This quickly raises my complexity flag.
>
Definitely, the combinatorial problem has made concurrently probing
all paths infeasible. The solution to detecting bad paths has been to
get feedback to the application from the transport (e.g. TCP) about
packet loss or excessive RTT for specific flows. The response to a bad
path is simply to reopen the connection and hope for a better path (by
virtue of a different 5-tuple hash for the new connection).
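
To put a rough number on the combinatorial problem, here is a minimal
sketch using the example above of two hops with three ECMP next hops each.
The function name and the hop lists are illustrative only, and it assumes
the worst case in which hashing at each hop is independent of the others.

```python
from itertools import product

def end_to_end_paths(hops):
    """Return every hop-by-hop combination of ECMP next hops.

    `hops` is a list with one tuple of next hops per router on the path.
    Assumes each hop's choice is independent of the previous hop's choice.
    """
    return list(product(*hops))

# Two boxes with three ECMP next hops each, as in the example above.
hops = [("A", "B", "C"), ("A'", "B'", "C'")]
paths = end_to_end_paths(hops)
print(len(paths))  # 9

# The count multiplies with every additional hop: 3 ** 4 for four such hops.
print(len(end_to_end_paths([("A", "B", "C")] * 4)))  # 81
```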


For encapsulation there is good news and bad news. The good news is
that when we're encapsulating over UDP we can affect the path simply by
twiddling the source port a little, so having the application reopen
its connection might not be a necessary response to a bad path. The bad
news is that it may be infeasible to get feedback about path quality
from a third-party guest. If the encapsulation layer wants to perform
path selection, some kind of in-band, per-flow signaling might be needed
at that layer.
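
As a rough illustration of the source port twiddling, the sketch below
models a single ECMP hop as a hash of the 5-tuple modulo the number of
paths, then searches for one source port per path. CRC32 is only a
stand-in here: real routers use vendor-specific hashes, so the ports it
picks mean nothing on a real network. The function names, the addresses
(documentation ranges) and the port range are all assumptions made for
the example.

```python
import struct
import zlib
from ipaddress import ip_address

UDP = 17  # IP protocol number for UDP

def ecmp_bucket(src_ip, dst_ip, proto, src_port, dst_port, n_paths):
    """Toy ECMP decision: CRC32 of the packed 5-tuple, modulo n_paths.

    CRC32 is a placeholder for whatever hash a real device uses.
    """
    key = (ip_address(src_ip).packed + ip_address(dst_ip).packed
           + struct.pack("!BHH", proto, src_port, dst_port))
    return zlib.crc32(key) % n_paths

def ports_covering_all_paths(src_ip, dst_ip, dst_port, n_paths,
                             first_port=49152, last_port=65535):
    """Return {bucket: source_port} with one source port per ECMP bucket.

    Walks the source port range and keeps the first port seen for each
    bucket. The result may be incomplete if the range is exhausted first.
    """
    chosen = {}
    for port in range(first_port, last_port + 1):
        bucket = ecmp_bucket(src_ip, dst_ip, UDP, port, dst_port, n_paths)
        chosen.setdefault(bucket, port)
        if len(chosen) == n_paths:
            break
    return chosen

# Example: an encapsulator sending to UDP port 4789 over three equal-cost paths.
ports = ports_covering_all_paths("192.0.2.1", "198.51.100.7", 4789, 3)
for bucket, port in sorted(ports.items()):
    print(f"path {bucket}: source port {port}")
```

This only covers one hop with a known hash. With several hops and unknown
per-hop algorithms, the encapsulator cannot compute the mapping and would
have to discover usable source ports by probing or by per-flow feedback.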

>>> One needs to argue if you really need the granularity for the complexity
>>> that will be needed to get this partially correct.
>>
>>> Well I think LISP RLOC-probing is good enough, but I am biased.  ;-)
>>
>> you are "rich" by having an additional control plane UDP port ;-)
>> I don't see much complexity although I agree that just knowing if the RLOC is
>> alive and ping-able may be enough in many cases. If we can get more like
>> in-band OAM - why not?
>
> You can do in-band OAM for the control port too.
>
>>> If an ITR sends a packet to the ETR's address, the middle boxes do not know if
>>> it is a control-packet versus a data-packet.
>>
>> true but e.g. for LISP, while the middle box may have no idea what 4341 and
>> 4342 as udp ports mean, it could still calculate a different hash bucket for
>> ECMP due to the different UDP port.
>
> The hash is modulo the number of ECMP paths. You build the control packets 
> with 5-tuples that generate hashes across all ECMP paths.
>
>> Hmm, my statement is so trivial - I assume you want to say something
>> different with your reply?
>>
>>
>>> I am trying to avoid problems. Seems like things are being over-engineered.
>>> Again.
>>
>> I would say the echo nonce in RFC6830 is not fundamentally different from
>> OAM. The N+E flags trigger some activity on the receiver, similar to OAM.
>
> It is way different. All we are testing is the forward path from ITR to ETR 
> (and nothing else). OAM proposals also measure other attributes of the path.
>
>>
>>> P.S. Sorry I keep being negative. And if one person says shut up, I'll stop
>>> posting.
>>
>> well, everyone is entitled to his/her opinion and to voice it in his/her
>> personal style. You have a point as I was a bit upset myself about this
>> draft: there is likely more about it, otherwise why would we discuss
>> yet-another-8-byte-header? Fabio offered the simplification the new header
>> offers but for a while we will have 2-3 slightly different headers that need
>> active support in the code or hardware. Actually in terms of code - be it C
>> or Verilog - I'm not sure there is any simplification ever over VxLAN + LISP.
>
> Well if one wants the VXLAN and LISP header to be consistent, it was already 
> that way from the start. And if you wanted to run L3 with a VXLAN header, you 
> use the destination MAC as the xTR's address and you got L3. There is no 
> demux needed because the inner ethernet header has an ethertype that can 
> demux every protocol ever invented in the world.
>
> Do not create a new registry for types, authors of GPE. PPP learned this in 
> 1989 and used ethertypes. Why is that different now?
>
> If there is an ethertype allocated for "ethernet" there can be one allocated 
> for NSH. And then if anyone ever wanted to run NSH directly over ethernet, 
> you have your ethertype.
>
> But I AM NOT RECOMMENDING THIS. I believe NSH should run on top of UDP and 
> get its own port number. NSH sends UDP packets, period.
>
>> But at least the situation of having both VxLAN and LISP can be simplified by
>> having a common umbrella and one common discussion.
>
> Agree.
>
>> Personally I think VxLAN-gpe is how the VxLAN/LISP header could have looked
>> like from the start (hindsight is great, I know) and I don't have a technical
>> problem with the draft itself. What is missing is enough context to discuss
>> it. E.g. I'm still not sure why there is a P flag, if for a hard technical
>> reason or for the aesthetics that every field is controlled by a turn-on flag
>> ;-)
>
> There is a P-flag so you have demux ability in the VXLAN/LISP header. So we 
> will demux at the UDP port level, the P-bit level, and the ethertype level. 
> All these will be demux decisions a forwarder has to deal with. This is quite 
> ridiculous.
>
>> So I encourage and kindly ask the authors to provide more of this context in
>> the next draft version.
>>
>> Regards, Marc
>
> Dino
>
>>
>>
>>
>> On Thu, 31 Jul 2014 16:16:53 -0700, Dino Farinacci wrote:
>>>> Dino,
>>>>
>>>> Would you re-phrase your response?  I am having some trouble parsing it, so
>>>> I must be missing something.
>>>>
>>>> First, I think (when you said "... sent from any pair of ports ...") you
>>>> meant to say "... sent with any pair of ports ..."  - but this is a guess.
>>>
>>> Yes "with" is a better way of stating it.
>>>
>>>> As for making OAM messages traverse the exact same path as data, this is
>>>> what OAM is expected to do.  In essence, if data follows a path that involves
>>>
>>> Good luck. I do not know how you will be able to control each ECMP path at
>>> each hop across different vendors as well as the same vendor with different
>>> hashing algorithms.
>>>
>>> One needs to argue if you really need the granularity for the complexity
>>> that will be needed to get this partially correct.
>>>
>>>> a non-zero number of gates, while OAM does not, the successful delivery of
>>>> OAM is only an approximate indication of the data-path integrity.  Any H/W
>>>> that data has to go through, and OAM does not go through, could fail and we
>>>> would see an OAM indication of a valid path through which data either would
>>>> not go, or would be diverted in some unexpected way.
>>>
>>> Well I think LISP RLOC-probing is good enough, but I am biased.  ;-)
>>>
>>>> Ordinarily, this should not be a problem for the hardware, as (ordinarily)
>>>> the OAM is indistinguishable from data.  The hardware works no harder to
>>>> push OAM than it would to push an equivalent amount of data.
>>>
>>> If an ITR sends a packet to the ETR's address, the middle boxes do not know if
>>> it is a control-packet versus a data-packet.
>>>
>>>> So, what is the problem again?
>>>
>>> I am trying to avoid problems. Seems like things are being over-engineered.
>>> Again.
>>>
>>> Dino
>>>
>>> P.S. Sorry I keep being negative. And if one person says shut up, I'll stop
>>> posting.
>>>
>>>>
>>>> --
>>>> Eric
>>>>
>>>> -----Original Message-----
>>>> From: nvo3 [mailto:nvo3-boun...@ietf.org] On Behalf Of Dino Farinacci
>>>> Sent: Wednesday, July 30, 2014 9:13 PM
>>>> To: Larry Kreeger
>>>> Cc: Tom Herbert; David Melman; Marc Binderberger; LISP mailing list list;
>>>> nvo3@ietf.org
>>>> Subject: Re: [nvo3] Comments on
>>>> http://tools.ietf.org/html/draft-quinn-vxlan-gpe-03
>>>>
>>>>> I'm assuming that routers and switches will be multipathing based on
>>>>> the UDP port numbers, so I would expect different destination UDP
>>>>> ports to take different equal cost paths.
>>>>
>>>> Well if OAM is going to be effective, messages need to be sent from any
>>>> pair of ports that yield 0 through N modulus so multiple paths can be
>>>> determined. So it doesn't matter which port number values you use,
>>>> those control packets will be ECMPed as well.
>>>>
>>>> If you are also implying that you want the OAM packets to go through the
>>>> same data-path of each device on the path, then you will have to put TLVs
>>>> in the data path, which is traditionally not prudent. See my Puneet
>>>> reference from previous email.
>>>>
>>>> Dino
>>>>
>>>> _______________________________________________
>>>> nvo3 mailing list
>>>> nvo3@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/nvo3
>>>
>

