On Wed, May 06, 2009 at 02:10:27PM -0700, Steven Dake wrote:
> On Wed, 2009-05-06 at 15:04 -0500, David Teigland wrote:
> > On Mon, Apr 13, 2009 at 02:17:00PM -0500, David Teigland wrote:
> > > On Mon, Apr 13, 2009 at 12:10:33PM -0700, Steven Dake wrote:
> > > > On Mon, 2009-04-13 at 13:35 -0500, David Teigland wrote:
> > > > > 0. configure token timeout to some long time that is longer than all
> > > > >    the following steps take
> > > > > 
> > > > > 1. cluster members are nodeid's: 1,2,3,4
> > > > > 
> > > > > 2. cpg foo has the following members:
> > > > >    nodeid 1, pid 10
> > > > >    nodeid 2, pid 20
> > > > >    nodeid 3, pid 30
> > > > >    nodeid 4, pid 40
> > > > > 
> > > > > 3. nodeid 4: ifdown eth0, kill corosync, kill pid 40
> > > > >    (optionally reboot this node now)
> > > > > 
> > > > > 4. nodeid 4: ifup eth0, start corosync
> > > > > 
> > > > > 5. members of cpg foo (1:10, 2:20, 3:30) all get a confchg
> > > > >    showing that 4:40 is not a member
> > > > > 
> > > > > 6. nodeid 4: start process pid 41 that joins cpg foo
> > > > > 
> > > > > 7. members of cpg foo (1:10, 2:20, 3:30, 4:41) all get a confchg
> > > > >    showing that 4:41 is a member
> > > > > 
> > > > > (Steps 6 and 7 should work the same even if the process started in
> > > > > step 6 has pid 40 instead of pid 41.)
> > > 
> > > > 100% agree that is how it should work.  If it doesn't, we will fix it.
> > > > The only thing that may be strange is if pid in step 6 is the same pid
> > > > as 40.  Are you certain the test case which fails has a differing pid at
> > > > step 6?
> > > 
> > > If you fix step 5, then I suspect steps 6,7 will "just work".  After the
> > > test failed at step 5 I didn't pay too much attention to 6,7... but I'm
> > > sure that the pid in step 6 was different (I didn't reboot the node).
> > 
> > It's not clear what the plan was for this, any recent related changes I
> > should try?
> > Dave
> > 
> 
> I haven't tried corosync with this test case, but it should work now.
> Did you try latest corosync on this case?   If it still fails Jan can
> address before 1.0.

Just tried it, and I get the same behavior as before.
Dave
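
For reference, the membership changes expected in steps 5 through 7 of the
quoted test case can be written down as a small toy model. This is only a
sketch of the expected outcome, not corosync code: Member, Group, node_down
and join are hypothetical names made up for illustration, and nothing here
calls the real cpg API.

```python
from typing import NamedTuple


class Member(NamedTuple):
    """One cpg member, identified by (nodeid, pid) as in the test case."""
    nodeid: int
    pid: int


class Group:
    """Toy stand-in for the membership of cpg foo (hypothetical, not corosync)."""

    def __init__(self, members):
        self.members = set(members)

    def node_down(self, nodeid):
        """Drop every process on a failed node and return the 'left' list."""
        left = {m for m in self.members if m.nodeid == nodeid}
        self.members -= left
        return sorted(left)

    def join(self, member):
        """Add a newly joined process and return the 'joined' list."""
        self.members.add(member)
        return [member]


# Step 2: initial membership of cpg foo.
foo = Group([Member(1, 10), Member(2, 20), Member(3, 30), Member(4, 40)])

# Steps 3-5: node 4 fails, so the remaining members should see 4:40 leave.
left = foo.node_down(4)

# Steps 6-7: a new process on node 4 joins, so everyone should see 4:41 join.
joined = foo.join(Member(4, 41))
```

In this model the same sequence works unchanged if the process started in
step 6 reuses pid 40, because 4:40 has already been removed in step 5.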

_______________________________________________
Openais mailing list
Openais@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/openais
