On 14/07/15 10:34 AM, Jan Pokorný wrote:
> On 08/07/15 17:41 -0700, Digimer wrote:
>> Second time in a couple of years I hit a weird bug... I have no idea how
>> to reproduce it, sadly.
>>
>> I added a VM to cluster.conf via ccs and it showed up on node 1 but not
>> node 2. I confirmed that node 2 had the updated cluster.conf, but the
>> running process wouldn't see it.
>
> Could be weird synchronization problem as touched at the bottom of
> https://bugzilla.redhat.com/show_bug.cgi?id=1157951#c2
>
> Just to limit already limited reproducibility (as I blindly expect those
> two cases to be related), I would highly recommend using equivalent of
> ccs-0.16.2-75.el6 or newer (from 6.7 Beta perhaps?).
>
>> I tried up'ing the version and running 'cman_tool version -r', no bueno
>> (tried on both nodes). I tried deleting the VM, pushing the new
>> cluster.conf and it disappeared from node 1 as expected. Added it back,
>> pushed again and again, only showed up on node 1's 'clustat'. I tried
>> again on the other node. No bueno...
>>
>> So I tried freezing all of the services on both nodes and restart
>> rgmanager on both nodes. The stop didn't touch the services, but
>> starting it back up tore down and restarted the services, messing up the
>> VMs. =/
>
> I would expect there is a lot of undefined behavior when (un)freezing
> under uneven circumstances.
>
>> I noticed on node 2 the following during shutdown:
>>
>> ========
>> Jul 8 20:24:57 rm-a01n02 rgmanager[3971]: Shutting down
>> Jul 8 20:25:07 rm-a01n02 rgmanager[3971]: Shutting down
>> Jul 8 20:25:08 rm-a01n02 rgmanager[3971]: Member 1 shutting down
>> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: Initializing vm:vm07-rhel6-temp
>> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: vm:vm07-rhel6-temp was added to the config, but I am not initializing it.
>> Jul 8 20:25:13 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:16 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:17 rm-a01n02 rgmanager[3971]: Disconnecting from CMAN
>> Jul 8 20:25:18 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:21 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:24 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:27 rm-a01n02 rgmanager[3971]: Reconfiguring
>> Jul 8 20:25:32 rm-a01n02 rgmanager[3971]: Exiting
>> ========
>>
>> Notice that at this point, the VM suddenly was found.
>>
>> So I am wondering; Is it possible to force the process above without
>> restarting rgmanager?
>
> You can try to substitute -HUP for -USR1 in the command stated
> https://bugzilla.redhat.com/show_bug.cgi?id=1157951#c4
> but not really sure it will force re-reading the config.
>
> Anyway, the original=dumping form might also provide some insights.

I will try to find a reproducer when I get home. As I mentioned in that
ticket, I hit it again on Sunday, but this time without using ccs. So I
think we can rule ccs out.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
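
For anyone who wants to rule out the file-sync side first (the "node 2 had the updated cluster.conf" check above), here is a minimal Python sketch. It assumes you have already copied the text of each node's cluster.conf somewhere readable; the helper names and node names are hypothetical, and the only thing it relies on is the config_version attribute on the root <cluster> element. It says nothing about what the running rgmanager has actually loaded, which is the real problem in this thread.

```python
import xml.etree.ElementTree as ET


def config_version(cluster_conf_xml):
    """Return the config_version attribute of a cluster.conf document."""
    root = ET.fromstring(cluster_conf_xml)
    if root.tag != "cluster":
        raise ValueError("root element is %r, expected 'cluster'" % root.tag)
    return int(root.attrib["config_version"])


def stale_nodes(versions):
    """Given {node_name: config_version}, list nodes behind the newest version."""
    newest = max(versions.values())
    return sorted(name for name, ver in versions.items() if ver < newest)


if __name__ == "__main__":
    # Hypothetical file contents, one per node, for illustration only.
    node1 = '<cluster name="demo" config_version="12"><rm/></cluster>'
    node2 = '<cluster name="demo" config_version="11"><rm/></cluster>'
    versions = {
        "node1": config_version(node1),
        "node2": config_version(node2),
    }
    print("on-disk versions:", versions)
    print("nodes behind:", stale_nodes(versions))
```

If both nodes report the same on-disk version and 'clustat' still differs between them, the files are in sync and the discrepancy is in the running daemons, which matches what was seen here.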
