No, all tests were with exactly the same builds on each machine.  I then
tested three times to see whether 5.1, 5.2, or -current would work.  All
three times I found the same results.

"Faithless is he, who says 'farewell', when the path darkens."
"you just keep on trying till you run out of cake"


On Thu, Feb 21, 2013 at 2:41 PM, sven falempin <[email protected]> wrote:

> On Thu, Feb 21, 2013 at 2:08 PM, sangdrax8 <[email protected]> wrote:
>
> > I am new to OpenBSD, but would like to take advantage of a redundant
> > setup with ipsec/carp/sasync.  I have run into a situation which seems
> > to be a bug, but thought it best if I first bring my questions here to
> > see if there is something I am missing.
> >
> > I have tried the following with 5.1-stable, 5.2-stable, and my
> > 5.2-stable setup with a snapshot kernel from 2/17/2013.  My main problem
> > exists across all three setups.  My guess is that it seems the phase 1
> > of an ipsec negotiation is not being synced with sasync, but I will
> > describe my setup and results below and see if anyone else can assist me
> > with this.
> >
> >
> > My setup:
> > fw1 and fw2 - carp/ipsec/sasync
> > lab1 - ipsec
> >
> > Part that works as I expected it to:
> >
> > My fw1 and fw2 boxes are successfully running carp, and my fw1 is the
> > master.  Using a machine behind the firewalls, I can initiate the ipsec
> > tunnel by sending some icmp packets to a machine behind the lab1 box.
> > While tcpdumping on the fw1 and fw2 interfaces, I can see the phase1 and
> > phase2 of ipsec happen on fw1, and esp traffic passing.  I then verify
> > sasync by running 'ipsecctl -s a' on both fw1 and fw2.  They both match,
> > indicating that the SA created by the master did make it to the backup
> > machine.
> >
> > I then wish to test failover between the two redundant firewalls, so I
> > run 'ifconfig -g carp carpdemote 128' on the master machine.  I quickly
> > see the backup take over, and the esp packets start showing up on my
> > tcpdump on the backup machine.  I see the sequence numbers jump by
> > 16384, which I have read is expected. (side note, this increase causes
> > the tunnel to break in 5.2-stable, but was reported and seems fixed in
> > my snapshot kernel tests, as well as working in 5.1-stable)  Initially
> > this looks good, and even the spi's in use are the same.  So again
> > sasync seems to be working, and I have a successful tunnel transition.
> >
> > Where things seem to go wrong:
> >
> > At this point if I keep watching the tcpdump on my fw2 (now the master
> > passing traffic) I see that about one or two minutes after it takes
> > over, it initiates a phase 1 re-key of the ipsec tunnel (and therefore a
> > new phase 2 under this new phase1).  This happens quickly, and I can see
> > the spi's change as the new association is now the one being used.  This
> > re-key also resets the previously mentioned sequence numbers, making it
> > easy to see when it took place.  I think things have gone wrong here,
> > but traffic passes and will continue to re-key new phase 2 just fine.
> > So it isn't obvious that anything is wrong.
> >
> > Evidence something is wrong:
> >
> > I now allow fw1 to take back over master with 'ifconfig -g carp
> > -carpdemote 128' which also works.  I see the traffic now on my fw1
> > tcpdump window, and the spi's are the ones that were re-negotiated by
> > the backup when it did the strange phase 1 and phase 2 rekey.  Once again
> > my sequence numbers jump by 16384, as expected.  Now watching the
> > tcpdump on fw1, I see that about one or two minutes in it attempts a
> > re-key, but not exactly like the backup one did when it took over.  It
> > only initiates a phase 2 re-key with the remote host.  This re-key is
> > attempted a few times, but always seems rejected by the lab1 side.
> > After waiting the default of nearly 20 minutes for phase 2 to expire,
> > the fw1 begins trying to get a phase 2 re-key again only to be denied
> > again by the lab machine.  Eventually the phase 2 expires, and all
> > traffic dies across the VPN.  It will stay dead, trying to re-key phase
> > 2 and being rejected by the lab1 machine.
> >
> > My best guess as to what is going on:
> >
> > So from the above sequence I am guessing that the sasync isn't actually
> > syncing a phase 1 between the fw1 and fw2.  Once the fw2 takes over, it
> > decides to re-key the phase 2 (perhaps due to high sequence numbers?)
> > but finds it has no valid phase 1 with which to talk to the lab machine.
> > It therefore initiates a new phase 1 negotiation with the lab machine,
> > which succeeds.  It follows this up with a phase 2, and traffic
> > continues to pass between these two boxes.  In this state (I am
> > guessing here) fw1 still holds a non-expired phase 1 association with
> > the lab box, which the lab box has replaced with the newly negotiated
> > phase 1 from fw2.  If fw2 tries to re-key phase 2,
> > everything works since fw2 and the lab box now agree on the phase 1
> > between them.  When I then allow fw1 to take back over as master, it
> > attempts to re-key phase 2 (again maybe due to sequence numbers?) but is
> > apparently rejected by lab1.  Since this phase 2 synced, traffic
> > continues but eventually the writing is on the wall.  Once this phase 2
> > that was synced from fw2 expires, all traffic dies.  Fw1 will not be
> > able to get a new phase 2 until the phase 1 expires and it re-keys phase
> > 1 with the lab box.  The nail in the coffin for me was that once nothing
> > will pass, if I demote fw1 again and let fw2 take over, the phase 2
> > re-builds and traffic will begin passing.  This again makes me think
> > that the only valid phase 1 is between fw2 and the lab1 box.  Finally, I
> > rebooted fw1.  This cleared all SA's (aka the phase 1 that I believe it
> > still had).  When it came back up it took over as carp master, traffic
> > dropped for a short time while it re-built a phase 1 and phase 2, and
> > then traffic began passing again.
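
The guess in the paragraph above can be written down as a toy model. This is only a sketch of the hypothesis, not of how isakmpd or sasyncd actually work: every name in it (Gateway, Lab, become_master, run_scenario) is made up for illustration, SA identifiers are plain integers, and it assumes that phase 2 SAs are replicated between fw1 and fw2 while phase 1 SAs are not, and that lab1 only accepts a phase 2 re-key under the phase 1 it negotiated most recently.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

_sa_ids = count(1)  # hypothetical SA identifiers, just increasing integers


@dataclass
class Gateway:
    name: str
    phase1: Optional[int] = None  # assumed NOT synced between fw1 and fw2
    phase2: Optional[int] = None  # synced to the peer (as 'ipsecctl -s a' showed)


class Lab:
    """Remote peer that keeps only the most recently negotiated phase 1."""

    def __init__(self) -> None:
        self.phase1: Optional[int] = None

    def negotiate_phase1(self, gw: Gateway) -> None:
        gw.phase1 = self.phase1 = next(_sa_ids)

    def rekey_phase2(self, gw: Gateway) -> bool:
        # Accept only if the request arrives under the phase 1 the lab still holds.
        if gw.phase1 is None or gw.phase1 != self.phase1:
            return False
        gw.phase2 = next(_sa_ids)
        return True


def become_master(gw: Gateway, peer: Gateway, lab: Lab) -> bool:
    """Re-key phase 2 after a failover; negotiate phase 1 first only if none is held."""
    if gw.phase1 is None:
        lab.negotiate_phase1(gw)
    ok = lab.rekey_phase2(gw)
    if ok:
        peer.phase2 = gw.phase2  # only the phase 2 SA is replicated
    return ok


def run_scenario() -> dict:
    fw1, fw2, lab = Gateway("fw1"), Gateway("fw2"), Lab()
    results = {}
    results["initial tunnel on fw1"] = become_master(fw1, fw2, lab)
    results["failover to fw2"] = become_master(fw2, fw1, lab)
    results["fail back to fw1"] = become_master(fw1, fw2, lab)
    fw1.phase1 = fw1.phase2 = None  # reboot of fw1 clears its SAs
    results["fw1 after reboot"] = become_master(fw1, fw2, lab)
    return results
```

Under these assumptions run_scenario() reports a successful re-key at every step except the fail back to fw1, which matches what the tcpdump traces showed: fw1 still holds its old phase 1, so it never renegotiates one until it is rebooted.
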
> >
> > This is easier to test on the snapshot kernel, because 5.1 doesn't seem
> > to support adjusting the timelimit for the phase 1 and 2 SA's.  I did
> > see this behavior in 5.1; it just took longer to test.
> >
> > I realize this is a long post, but I wanted to get some opinions before
> > just filing a bug report.  This is my first real attempt at getting
> > synchronized OpenBSD encryption devices running, and I would prefer if
> > someone else could verify what I am seeing.
> >
> >
> > "Faithless is he, who says 'farewell', when the path darkens."
> > "you just keep on trying till you run out of cake"
> >
> >
> Are you replicating between different versions?
>
> --
>
> ---------------------------------------------------------------------------------------------------------------------
> () ascii ribbon campaign - against html e-mail
> /\
