Mike,
See below...
Thanks,
Lou
On 10/28/2013 08:56 PM, Mike Sullenberger (mls) wrote:
> Lou,
>
> Thanks; again, answers inline :-).
>
> Mike.
>
> Mike Sullenberger, DSE
> [email protected] .:|:.:|:.
> Customer Advocacy CISCO
>
>> -----Original Message-----
>> From: Lou Berger [mailto:[email protected]]
>> Sent: Thursday, October 24, 2013 8:57 AM
>> To: Mike Sullenberger (mls)
>> Cc: IPsecme WG; [email protected]
>> Subject: Re: [IPsec] Some comments on draft-detienne-dmvpn-00
>>
>> Hi Mike,
>>
>> Thanks for the response. See below...
>>
>> On 10/23/2013 2:54 PM, Mike Sullenberger (mls) wrote:
>>> Lou,
>>>
>>> Thank you for your comments, more inline.
>>>
>>> Mike.
>>>
>>> Mike Sullenberger, DSE
>>> [email protected] .:|:.:|:.
>>> Customer Advocacy CISCO
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Lou Berger [mailto:[email protected]]
>>>> Sent: Friday, October 18, 2013 3:29 PM
>>>> To: [email protected]
>>>> Cc: IPsecme WG
>>>> Subject: Some comments on draft-detienne-dmvpn-00
>>>>
>>>> Hi,
>>>> I have the following comments/questions on the draft:
>>>>
>>>> - Why allow IPsec tunnel mode? Is there a case where it provides some
>>>> value?
>>>
>>> [Mike Sullenberger]
>>> You are correct that IPsec tunnel mode is not really needed;
>>> transport mode is strongly recommended and preferred. There are a
>>> couple of rare situations where tunnel mode could help, such as
>>> separating the GRE/NHRP tunneling and the IPsec encryption onto
>>> separate nodes. These cases are not recommended, but we thought we
>>> should leave the possibility in.
>>>
>>
>> How does this work?
>>
>> As I understand it, NHRP is run inside the GRE tunnel, so this means
>> that the IPsec and GRE endpoints now need some way to communicate
>> (e.g., public ("NBMA") addresses and GRE/IPsec tunnel
>> creation/removal coordination).
>>
>> Am I missing something?
>>
>
> [Mike Sullenberger]
> You have this correct, and basically we don't coordinate the NHRP and
> IPsec databases in this case. It is expected that the IPsec node
> will create IPsec SAs on demand due to GRE tunnel traffic and will
> remove them when there is no more GRE tunnel traffic. This is
> basically why we don't recommend this type of setup, but since I
> don't know the future I hate to remove the possibility; that is
> why supporting IPsec tunnel mode is a "MAY" in the draft.
>
Sounds to me like this would be a different, and perhaps
non-interoperable, mode/solution. It seems to me that tunnel mode
should be left out until the full solution is specified...
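For what it's worth, tunnel mode also costs an extra IP header on every
packet. A back-of-the-envelope comparison (Python; the ESP figure is an
assumed placeholder, since real overhead depends on cipher, IV, ICV, and
padding):

    # Rough per-packet overhead for GRE over IPsec/ESP (IPv4).
    IP_HDR  = 20    # IPv4 header
    GRE_HDR = 4     # base GRE header, no optional fields
    ESP_OVH = 24    # assumed ESP header + IV + ICV (cipher-dependent)

    transport = IP_HDR + GRE_HDR + ESP_OVH           # ESP wraps just the GRE payload
    tunnel    = IP_HDR + ESP_OVH + IP_HDR + GRE_HDR  # ESP adds its own outer IP header

    print(f"transport: {transport} bytes")  # 48
    print(f"tunnel:    {tunnel} bytes")     # 68 -- an extra 20-byte IP header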
>>
>>>> - Do you want to recommend omitting the GRE checksum?
>>>
>>> [Mike Sullenberger]
>>> Good idea; we definitely don't use the GRE checksum, and I don't
>>> think it provides any value, so we should recommend omitting it. I
>>> think this is also the case for most of the other optional GRE
>>> header attributes.
>>
>>
>>> Though, the GRE Tunnel Key must be allowed, handled and provided.
>>>
>>
>> I missed that the tunnel key MUST be "provided". Can you elaborate on
>> this? I only see one oblique reference to it in the draft.
>
> [Mike Sullenberger]
> For a node to support a single DMVPN you don't need a GRE tunnel key
> and if the node supports multiple DMVPNs and each one has a different
> tunnel source (NBMA) address, then again you don't need a GRE tunnel
> key. Only if a single node is supporting multiple DMVPNs and is
> using the same tunnel source (NBMA) address on them then you would
> need a something to differentiate the GRE tunnel packets and
> logically that would be the tunnel key. So given that perhaps
> supporting a GRE tunnel key would be a "SHOULD" or "MAY".
> This could
> be an implementation detail, but we could still clarify this in the
> draft.
Clarifying it makes sense, particularly as its applicability/need
differs based on the usage environment.
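To make the demux case concrete, a minimal sketch (the field layout is
per RFC 2784/2890; the function name and key-to-DMVPN mapping are
hypothetical):

    # Sketch: demultiplexing GRE packets that arrive on a shared tunnel
    # source (NBMA) address onto per-DMVPN contexts via the GRE Key.
    import struct

    def gre_key(pkt: bytes):
        """Return the GRE Key (RFC 2890) if the K bit is set, else None."""
        flags, _proto = struct.unpack("!HH", pkt[:4])
        if flags & 0x2000:              # K bit: Key field present
            off = 4
            if flags & 0x8000:          # C bit: Checksum+Reserved1 come first
                off += 4
            return struct.unpack("!I", pkt[off:off + 4])[0]
        return None

    # Hypothetical mapping of configured tunnel keys to the DMVPN
    # instances sharing one NBMA address on this node.
    dmvpn_by_key = {100: "dmvpn-red", 200: "dmvpn-blue"}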
>>
>>>> - I think the draft should discuss what happens when the best route
>>>> moves from one spoke to another spoke. Both the cases where the
>>>> host/prefix is still reachable via the original spoke and when it
>>>> isn't should be covered. As should avoiding blackholes, and any
>>>> periods of suboptimal forwarding.
>>>
>>> [Mike Sullenberger]
>>> We can add some more discussion around this point. The main feature
>>> here is that this is handled by the routing protocol that is running
>>> over the DMVPN tunnels.
>>
>> For clarification: routing runs over the configured DMVPN topology, but not
>> shortcuts, right?
>
> [Mike Sullenberger]
> That is correct. The reason we don't run the routing protocol over
> the shortcuts is that the routing protocol has constant traffic
> (hellos), which would tend to keep the shortcuts up, and these
> packets are a bit of a pain to filter out if needed. Also, in this
> case the routing protocol on the spokes would tend to acquire many
> routing neighbors, which could overwhelm the spoke, CPU-wise, and
> overwhelm the routing protocol with too many forwarding options.
>
I think the dynamic adjacency issue could also be problematic for some
routing protocols (and/or implementations), so this is good, but it
does come at the cost of NHRP needing to carry/indicate all routes
reachable over a shortcut.
>>
>>> The routing protocol will redirect the data packets via different
>>> tunnels and/or spokes as the routing is updated.
>>
>> I think your answer to my next question actually helps answer this one.
>>
>> So Section 6.1 of RFC2332 says:
>> ... In such
>> circumstances, NHRP should not be used. This situation can be
>> avoided if there are no "back door" paths between the entry and
>> egress router outside of the NBMA subnetwork. Protocol mechanisms to
>> relax these restrictions are under investigation.
>>
>> Do you believe this restriction still applies?
>
> [Mike Sullenberger]
> I do not believe that this restriction applies. I have never seen an
> issue along these lines where there wasn't a mis-design or
> mis-configuration of the routing protocol. In DMVPN, NHRP uses the
> routing protocol as the final source of truth for destinations
> reachable through the VPN. If the routing protocol changes such that
> packets are no longer routed through the VPN, then NHRP needs to
> clear any routing/forwarding/shortcut state that would forward
> traffic to that destination via the VPN. I think the mechanism to
> detect something like this is implementation dependent, but we
> probably should mention that such a mechanism may/should be
> provided.
>
Agreed. Also, I think you have to address the quoted text directly,
or readers will assume that it applies here too.
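On the detection mechanism, one implementation-dependent option would be
a RIB watcher along these lines (a sketch only; all names are assumed,
not from the draft):

    # Sketch: clear NHRP shortcut state when routing moves a destination
    # off the VPN tunnel interface, so the stale shortcut can't keep
    # forwarding (or blackholing) traffic via the VPN.
    def on_route_change(prefix, new_egress_ifc, nhrp_cache,
                        vpn_ifc="Tunnel0"):
        if prefix in nhrp_cache and new_egress_ifc != vpn_ifc:
            del nhrp_cache[prefix]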
>>
>>> Note, if static routing is used then you lose this capability.
>>
>> Sure.
>>>
>>>> - I think the draft is missing a description of how/when NHRP Purges
>>>> are used, e.g., resulting from interactions with routing. (Yes there
>>>> is an overlap with the above, but it depends a bit on your solution.)
>>>
>>> [Mike Sullenberger]
>>> As you have noted, NHRP purges are used to keep the distributed NHRP
>>> database in sync. If a local node loses access to a destination for
>>> which it has previously replied with itself as the egress point in
>>> an NHRP mapping (and the entry hasn't timed out yet), then it will
>>> generate an NHRP purge and send a copy to each requester (recorded
>>> when it sent the original reply). This will then clear out these
>>> now-invalid mapping entries on the remote nodes and trigger them to
>>> find an alternate path, if available. Note, this is basically what
>>> is described in NHRP (RFC 2332); we didn't really want to duplicate
>>> it in this draft, but it could be added.
>>>
>>
>> okay, I think this comes back to where the draft says:
>>> In this document, we will depart the
>>> underlying notion of a centralized NHS.
>>
>> I think the part that's missing (or perhaps I just missed) is an explicit
>> statement that an egress must follow the NHS procedures related to any
>> issued Resolution Reply.
>>
>
> [Mike Sullenberger]
> I think this is a place where we need to point to the NHRP RFC for
> these procedures, and call out any differences, if any.
> Is this what you are getting at?
>
Yes, that certainly would address my comment.
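For what that pointer implies in practice, the requester bookkeeping
might look roughly like this (a sketch; send_nhrp_purge is an assumed
helper, not a real API):

    # Sketch: an egress that answers Resolution Requests records each
    # requester, so it can fan out NHRP Purges (RFC 2332) if the mapping
    # becomes invalid before it times out.
    from collections import defaultdict

    replies_sent = defaultdict(set)   # prefix -> requester NBMA addresses

    def on_resolution_reply(prefix, requester):
        replies_sent[prefix].add(requester)

    def on_destination_lost(prefix):
        for requester in replies_sent.pop(prefix, ()):
            send_nhrp_purge(requester, prefix)   # assumed helper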
>>>> - I think the draft should discuss the NHRP scaling considerations
>>>> that are important in implementation and deployment/operation.
>>>> (Basically the solution is proposing network-wide ARP.) You
>>>> already have a teaser on this when you mention rate limiting.
>>>
>>> [Mike Sullenberger]
>>> We can certainly do so. In DMVPN deployments we haven't really found
>>> NHRP scaling to be an issue; usually we ran into either routing
>>> protocol or IPsec scaling issues first. It is correct that we do
>>> mention, in a couple of places, rate limiting the triggering or
>>> sending of NHRP messages, mainly because it wasn't felt to be
>>> useful to keep "bothering" another node while it was working on a
>>> previous request identical to the one about to be sent again.
>>> Note, we do have mechanisms for retransmission and back-off of
>>> messages. Again, some of this is covered in the NHRP RFC.
>>>
>>
>> I guess it all depends on the number of routes in the system and the
>> reachability pattern at a particular spoke. I think when both are
>> large, the use of per-prefix soft-state refreshes will prove
>> problematic. I'm a bit surprised you've run into routing protocol
>> scaling issues, though, but that's certainly out of scope.
>
> [Mike Sullenberger]
> Routing protocol scaling is only seen on the hubs (NHSs), since they
> are basically the route reflectors for the spokes. We don't normally
> see any routing issues on the spokes. Spokes in general don't build
> that many shortcuts and therefore don't have many per-prefix
> refreshes. Even in cases where a spoke builds out 1000 shortcuts,
> the refreshes, since they are usually on the order of 10s of minutes
> apart, don't cause a scaling issue on the spoke. Also, such a spoke
> is normally a larger box, since it has to handle 1000+ ISAKMP/IPsec
> SAs and presumably the traffic that goes with them.
>
This all makes sense in simple topologies, which match the RFC's stated
limitation and are probably also the most common. All good stuff for a
scaling considerations section.
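Sanity-checking your numbers (assumptions only: 1000 shortcuts,
refreshes ~10 minutes apart):

    # Back-of-the-envelope refresh load on a large spoke.
    shortcuts        = 1000        # shortcut entries (assumed)
    refresh_interval = 10 * 60     # seconds; ~10 minutes per prefix

    print(f"~{shortcuts / refresh_interval:.1f} refreshes/sec")  # ~1.7/sec

That does indeed look negligible next to maintaining 1000+ ISAKMP/IPsec
SAs.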
How are you avoiding blackholes on shortcut failures that don't match
any route/router changes? Are you assuming any keepalives/failure
detection on the shortcuts?
>>>> - NIT/editorial: If section 4 is your "Solution Overview", where is
>>>> the solution specification? More seriously, I found parts of this
>>>> section more of a narrative of an example than a protocol specification.
>>>
>>> [Mike Sullenberger]
>>> Yes, I think this needs to be cleaned up. Since a lot of what we do
>>> with NHRP is covered in the NHRP RFC, we didn't want to duplicate
>>> too much here, but we can certainly provide a clearer solution
>>> specification. I think the solution overview section is also
>>> useful, since a walk-through can help people understand how the
>>> solution is intended to work. Many times I find it hard, with just
>>> a solution specification, to get a real feel for how things fit
>>> together.
>>>
>>
>> Avoiding duplicate specification is certainly goodness, but I think some
>> additional pointers to when standard NHRP procedures are to be followed
>> (or not) would be a valuable addition as part of the cleanup.
>
> [Mike Sullenberger]
> Yes, we can add in more guidance along these lines.
>
> Your comments are a great help in solidifying what needs to be done for the
> next and following drafts.
>
Thanks,
Lou
> Thanks,
>
> Mike.
>
>>
>> Much thanks,
>> Lou
>>
>>>> - NIT: Assuming the Indirection Notification described in section 4.3
>>>> is the same as the NHRP Traffic Indication covered in 5.1, can you
>>>> align the names and fix the reference in 4.3?
>>>
>>> [Mike Sullenberger]
>>> Yes, thanks for noting this. We tend to use those terms interchangeably,
>>> but we should be more consistent here.
>>>
>>>>
>>>> Thanks,
>>>> Lou
_______________________________________________
IPsec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ipsec