On 08/07/15 17:41 -0700, Digimer wrote:
> Second time in a couple of years I hit a weird bug... I have no idea how
> to reproduce it, sadly.
>
> I added a VM to cluster.conf via ccs and it showed up on node 1 but not
> node 2. I confirmed that node 2 had the updated cluster.conf, but the
> running process wouldn't see it.

Could be a weird synchronization problem, as touched on at the bottom of
https://bugzilla.redhat.com/show_bug.cgi?id=1157951#c2

Just to limit the already limited reproducibility (as I blindly expect
those two cases to be related), I would highly recommend using the
equivalent of ccs-0.16.2-75.el6 or newer (from 6.7 Beta perhaps?).

> I tried up'ing the version and running 'cman_tool version -r', no bueno
> (tried on both nodes). I tried deleting the VM, pushing the new
> cluster.conf and it disappeared from node 1 as expected. Added it back,
> pushed again and again, only showed up on node 1's 'clustat'. I tried
> again on the other node. No bueno...
>
> So I tried freezing all of the services on both nodes and restart
> rgmanager on both nodes. The stop didn't touch the services, but
> starting it back up tore down and restarted the services, messing up the
> VMs. =/

I would expect there is a lot of undefined behavior when (un)freezing
under uneven circumstances.

> I noticed on node 2 the following during shutdown:
>
> ========
> Jul 8 20:24:57 rm-a01n02 rgmanager[3971]: Shutting down
> Jul 8 20:25:07 rm-a01n02 rgmanager[3971]: Shutting down
> Jul 8 20:25:08 rm-a01n02 rgmanager[3971]: Member 1 shutting down
> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: Initializing vm:vm07-rhel6-temp
> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: vm:vm07-rhel6-temp was added to the config, but I am not initializing it.
> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:16 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:17 rm-a01n02 rgmanager[3971]: Disconnecting from CMAN
> Jul 8 20:25:18 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:21 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:24 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:27 rm-a01n02 rgmanager[3971]: Reconfiguring
> Jul 8 20:25:32 rm-a01n02 rgmanager[3971]: Exiting
> ========
>
> Notice that at this point, the VM suddenly was found.
>
> So I am wondering; Is it possible to force the process above without
> restarting rgmanager?

You can try substituting -HUP for -USR1 in the command stated at
https://bugzilla.redhat.com/show_bug.cgi?id=1157951#c4
but I am not really sure it will force re-reading the config. Anyway,
the original (dumping) form might also provide some insights.

--
Jan
_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
