And here's the reply:
--
Travis,
Thank you for this insight; it helped me solve the issue.
After staring into the logs with the help of grep, I was able to learn the
underlying mechanisms (instead of just copying and pasting my racoon.conf).
Within racoon.conf, the "lifetime" in the "remote" directive sets
the duration of the IKE phase 1 session. This session will only be renewed
(aka "renegotiated" in the logs) if there is an underlying IPsec phase 2
session. Here's the relevant text from syslog during a successful renewal:
Apr 8 07:06:05 xxxx racoon: INFO: renegotiating phase1 to x.x.x.x due to
active phase2
Apr 8 07:06:05 xxxx racoon: oakley_dh_compute(MODP1024): 0.002476
Apr 8 07:06:05 xxxx racoon: phase1(ident I msg3): 0.010048
Apr 8 07:06:05 xxxx racoon: phase1(ident R msg3): 0.001543
Apr 8 07:06:05 xxxx racoon: phase1(Identity Protection): 0.054305
The IKE phase 2 session lifetime is set in the "sainfo" directive. I had
set both phase 1 and phase 2 lifetimes to an identical value of two hours.
Historically, this symmetry wasn't an issue because the phase 1 negotiation
would take some nontrivial amount of time, which built a gap into the
expiration times. But my hardware (delicious Avoton C2750s) has reduced
these negotiation times to near zero, and this was causing both phases to
expire simultaneously. Here's the syslog from an event that dropped traffic:
Apr 8 06:34:13 xxxx racoon: INFO: ISAKMP-SA expired
Apr 8 06:34:13 xxxx racoon: INFO: ISAKMP-SA deleted
Apr 8 06:34:13 xxxx racoon: INFO: IPsec-SA expired: ESP/Transport
Apr 8 06:34:13 xxxx racoon: INFO: IPsec-SA expired: ESP/Transport
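For anyone who wants to look for the same pattern in their own logs, here is a
minimal Python sketch of that check. It assumes the syslog lines look like the
excerpts above; the function name and the sample data are illustrative only
and are not part of racoon or opennhrp.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpts above:
#   "Apr 8 06:34:13 host racoon: INFO: <message>"
# An optional "[pid]" after "racoon" is tolerated.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+racoon(?:\[\d+\])?:\s+INFO:\s+(.*)$"
)

def simultaneous_expiries(lines):
    """Return timestamps (in order of appearance) at which both a
    phase 1 (ISAKMP) SA and a phase 2 (IPsec) SA were logged as expired."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp, msg = m.groups()
        if msg.startswith("ISAKMP-SA expired"):
            seen[stamp].add("phase1")
        elif msg.startswith("IPsec-SA expired"):
            seen[stamp].add("phase2")
    return [s for s, kinds in seen.items() if kinds == {"phase1", "phase2"}]

# Sample data copied from the log excerpts in this message.
SAMPLE = [
    "Apr 8 06:34:13 xxxx racoon: INFO: ISAKMP-SA expired",
    "Apr 8 06:34:13 xxxx racoon: INFO: ISAKMP-SA deleted",
    "Apr 8 06:34:13 xxxx racoon: INFO: IPsec-SA expired: ESP/Transport",
    "Apr 8 07:06:05 xxxx racoon: INFO: renegotiating phase1 to x.x.x.x due to",
]

if __name__ == "__main__":
    for stamp in simultaneous_expiries(SAMPLE):
        print("phase 1 and phase 2 expired together at", stamp)
```

The comparison is per second, because that is the resolution of the syslog
timestamps shown here.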
I have solved the problem by renewing phase 1 only every 24 hours (off
business hours) with phase 2 renewing much more frequently and with a
lifetime that is not a divisor of the phase 1 lifetime (leap seconds and
millennia notwithstanding). I am going to try Diffie-Hellman group 5 since
that seems like low-hanging fruit from a security perspective.
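To illustrate the divisor reasoning, here is a rough Python sketch under an
idealized model: both timers start together, each SA is renewed exactly at
expiry, and negotiation takes zero time. The lifetimes used are hypothetical
values chosen for illustration, not the ones from the actual configuration.

```python
from math import gcd

def first_coincidence(phase1_secs, phase2_secs):
    """Seconds until both lifetimes next expire at the same instant.

    Idealized model: timers start together and renewals take zero time,
    so the expirations line up at the least common multiple.
    """
    return phase1_secs * phase2_secs // gcd(phase1_secs, phase2_secs)

HOUR = 3600
DAY = 24 * HOUR

if __name__ == "__main__":
    # Hypothetical (phase 1, phase 2) lifetime pairs, in seconds.
    for p1, p2 in [(2 * HOUR, 2 * HOUR), (DAY, 1 * HOUR), (DAY, 50 * 60)]:
        t = first_coincidence(p1, p2)
        print(f"phase1={p1}s phase2={p2}s -> first joint expiry after {t / HOUR:g} h")
```

In this model, identical two-hour lifetimes collide at every expiry, a phase 2
lifetime that divides 24 hours collides at every phase 1 expiry, and a
non-divisor such as 50 minutes pushes the first collision out to 120 hours.
Real timers drift with negotiation time, so treat this only as intuition.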
Thanks again for your help, Travis.
Regards,
Darren Ginter
On Wed, Apr 9, 2014 at 2:30 PM, Hegner, Travis <[email protected]> wrote:
> I originally sent the following on Friday afternoon and forgot to copy
> the list. Here it is for everyone’s benefit.
>
> --
>
> Hi Darren,
>
> The “Peer inserted to multicast list” message indicates that an
> opennhrp peer has registered with this machine (typically this would appear
> on the next hop server). Frequent registration is a good thing, especially
> if the spokes are dynamic. In the case of an address change, there will be
> connection issues to that spoke until the /new/ NBMA address (public IP) is
> registered at the opennhrp hub.
>
> With racoon, you have an ISAKMP lifetime (phase 1), and also, separately,
> an SA lifetime (phase 2). The SA typically has a built-in overlap: when a
> new SA is negotiated between two hosts (typically at roughly two-thirds of
> the lifetime), they send traffic encrypted with the new SA keys, but they
> are still supposed to /receive/ traffic with either the new **or** the
> not-yet-expired SA. This allows packets which are in flight after an SA
> renegotiation to still be accepted and decrypted by their destination.
>
> I’m a little less familiar with how the ISAKMP lifetime technically works,
> but I do know that the ISAKMP relationship is the framework, for lack of a
> better word, which allows the negotiation and creation of the SA. In most
> configuration examples I have seen, the ISAKMP lifetime is typically
> allowed to last much longer than an SA.
>
> The ISAKMP lifetime is set in the “proposal” section of your “remote”
> section. A common default there is 24 hours as you mention.
>
> The SA lifetime is set in the “sainfo” section, and 24 hours is far too
> long in an opennhrp environment in my opinion. Three hours is a common
> default there, but remember that is for a simple static ipsec
> implementation. In a Cisco DMVPN (which uses the NHRP protocol), they
> recommend an SA lifetime of only 2 minutes, which could add a lot of
> overhead if you have a large number of spokes.
>
> The settings you use are wholly dependent on your environment… and the SA
> lifetime is essentially how often your encryption keys are rotated.
> You’ll have to balance overhead with security.
>
> I’d be inclined to think that the connection freeze you are seeing is at
> the ISAKMP re-negotiation. You can see the creation date of the ISAKMP
> tunnel with the command “sudo racoonctl show-sa isakmp”. With that,
> extrapolate the expiration/re-negotiation date, and determine if that is in
> fact when you see the freeze. If so, it would be easy enough to reduce the
> occurrence by setting the “proposal” lifetime time to something like 24
> hours. Assuming an 8-hour work day, there is a 2/3 probability that the
> re-key happens outside of business hours. And if it does happen during
> business hours, it will likely happen only once.
>
> You may also be able to reduce the likelihood of a freeze further with the
> “rekey force” option under your “remote” section in racoon.conf. If I
> understand it correctly, by default the ISAKMP tunnel will not be
> re-established until traffic has been passed through it. This could result
> in some lost packets in the time it takes to re-negotiate. The “rekey
> force” option tells racoon to re-negotiate the tunnel immediately at
> expiration, without waiting for user traffic.
>
> I’d be happy to see comments on this topic from anyone who is more familiar
> with how it all works together, especially in a DMVPN environment, as we
> will be migrating our production network to an opennhrp-based system this
> year.
>
> Thanks,
>
> Travis Hegner
>
> http://travishegner.com/
>
> *From:* Darren Ginter [mailto:[email protected]]
> *Sent:* Friday, April 04, 2014 9:42 AM
> *To:* [email protected]
> *Subject:* [opennhrp-devel] Network Freezing
>
> I have my opennhrp network up and running with only one issue at this
> point: the connections all seem to freeze for a quick second every now and
> then. Most applications don't have a problem but others do - Microsoft
> Outlook will indicate that the connection to Exchange has been lost and
> then quickly indicate that it has been restored.
>
> In checking the logs, I am seeing "Peer inserted to multicast list" every
> 20 minutes for each of my connections. Is there some way to increase this
> interval?
>
> Also, I am seeing some howtos for racoon.conf that indicate "lifetime time
> 24 hours". Could this help?