On Mon, Nov 5, 2012 at 5:33 PM, Vladislav Bogdanov <bub...@hoster-ok.com> wrote:
> 05.11.2012 08:40, Andrew Beekhof wrote:
>> On Fri, Nov 2, 2012 at 6:22 PM, Vladislav Bogdanov <bub...@hoster-ok.com>
>> wrote:
>>> 02.11.2012 02:05, Andrew Beekhof wrote:
>>>> On Thu, Nov 1, 2012 at 5:09 PM, Vladislav Bogdanov <bub...@hoster-ok.com>
>>>> wrote:
>>>>> 01.11.2012 02:47, Andrew Beekhof wrote:
>>>>> ...
>>>>>>>
>>>>>>> One remark about that - it requires that gfs2 communicates with dlm in
>>>>>>> the kernel space - so gfs_controld is not longer required. I think
>>>>>>> Fedora 17 is the first version with that feature. And it is definitely
>>>>>>> not available for EL6 (centos6 which I use).
>>>>>>>
>>>>>>> But I have preliminary success running GFS2 with corosync2 and pacemaker
>>>>>>> 1.1.8 on EL6. dlm4 runs just fine as is (although it misses some
>>>>>>> featured on EL6 because of kernel). And it still includes (not
>>>>>>> documented) option enable_fscontrol, so user-space communication with fs
>>>>>>> control daemons is supported. Even it that feature will be removed
>>>>>>> upstream, it can be easily returned back - just several lines of code.
>>>>>>> And I ported gfs_controld from cman to corosync2 (patch is very dirty
>>>>>>> yet, made with scissors and needle, just a proof-of-concept that it even
>>>>>>> can work). Some features are unsupported (f.e. nodir) and will not be
>>>>>>> implemented by me.
>>>>>>
>>>>>> I'm impressed. What was the motivation though? You really really
>>>>>> don't like CMAN? :-)
>>>>>
>>>>> Why should I like software which is going to die? ;)
>>>>>
>>>>> I believe that how things are done currently (third case from your list)
>>>>> fully reflect my "perfectionistic" needs. I had many problems with
>>>>> cman+pacemaker in a past. Most critical is that pacemaker and
>>>>> dlm_controld react differently when node reappears back very soon after
>>>>> if was lost (because pacemaker uses totem ? directly for membership, but
>>>>> dlm uses CPG).
>>>>
>>>> We both get it from the CPG and quorum APIs for option 3.
>>>
>>> Yes, but not for 1 nor for 2.
>>
>> Not quite. We used to ignore it for option 2, but not anymore.
>> Option 2 uses CPG for messaging.
>>
>>> I saw described behavior with both of
>>> them, but not with 3.
>>> That's why I decided to go with 3 which I think conceptually right.
>>>
>>>>
>>>>> Pacemaker accepts that, but controld freezes lockspaces,
>>>>> waiting for fencing. But fencing is never done because nobody handles
>>>>> "node lost" CPG event.
>>>>
>>>> WTF. Pacemaker should absolutely do this. Bug report?
>>>
>>> Sorry for being unclear.
>>> I saw that with both 1 and 2 (where pacemaker did not use CPG), until I
>>> "fixed" fencing at dlm layer for 1. I modified it to request fencing if
>>> "node down" event occurs and then did not see freezes anymore. From what
>>> I understand, "node down" CPG event occurs when corosync forms
>>> transitional membership (at least pacemaker logged lines about that at
>>> the same time with dlm freeze. And if stable membership occurs
>>> (milli-)seconds after transitional one, pacemaker (as of probable 1.1.6)
>>> did not fence re-appeared node. I can understand that - pacemaker can
>>> absolutely live with that. But dlm cannot.
>>
>> Right. Any sort of membership hiccup is fatal as far as the dlm is concerned.
>> But even with options 1 and 2, it should still make a fencing request.
>
> I'm afraid no. At least not with 3.0.17 or 3.1.7.

Actually the system as a whole does, but you have to know where to look.
It's fenced that triggers the node fencing on a CPG change.

Look for

    if (left_list[i].reason == CPG_REASON_NODEDOWN ||
        left_list[i].reason == CPG_REASON_PROCDOWN) {
            memb->failed = 1;
            cg->failed_count++;

in add_change() in fence/fenced/cpg.c and later:

    /* failed nodes in this change become victims */
    add_victims(fd, cg);

A better understanding of these interdependencies is why we no longer
recommend starting cman via directives in corosync.conf: doing so
won't start fenced or any of the other bits that are needed.
CTS has also improved to test the integration better.

We've spent a lot of time recently specifically making sure that
cman/fenced-initiated fencing works just as well as pacemaker-initiated
fencing does.

> Sources are clear
> about that - CPG node down event does not result in fencing requested by
> dlm_controld. And that was a major problem for me with options 1 and 2.
> One-line patch solved that though. But I decided that cman is a no-go
> for me anymore because such critical issues as proper fencing should be
> tested thoroughly and if they are not, then I will feel like sitting on
> a bomb with it.
>
>>
>> Without fence_pcmk in cluster.conf that request might have gotten
>> lost, but with 1.1.8 I would expect the node to be shot - regardless
>> of whether the rest of Pacemaker thought it was ok.
>> Thats why going direct to stonithd was an important change.
>
> Aha. I tried cman last time before fence_pcmk was written (and before
> that fencing call dlm_controld.pcmk uses was modified to go straight to
> stonithd). I recall I was polishing option 1 that time (after throwing
> cman away), and first revision of that move did not work because it used
> async libstonithd call to fence a node. That's why I used direct calls
> to stonith in my version of dlm_controld.pcmk. All that resulted in
> fully-working stack and I decided to go with option 3 only after hearing
> from you that you do not test pacemaker with corosync1 yourselves anymore.
>
> That was second major problem with option 1 - before all that changes
> there was a possibility for fencing request to be dropped silently. And
> I actually hit that. I do not know if it fully works with stock 3.0.17
> dlm_controld.pcmk (I suspect no because of issue 1) but with my builds
> it is stable.
>
> Anyways, I seem to be happy with option 3 on EL6, it introduces clean
> and straight-forward model of cluster stack and it works perfectly, so I
> do not see any reasons to return back to option 1 or 2.

Happy to hear it.
I'm not actually trying to make you stop using it :)

>
>
>>
>>> And it is its task to do
>>> proper fencing in case it cannot work, not pacemaker's. But that piece
>>> was missing there. The same is (probably, I may be damn wrong here) true
>>> for cman - I did a quick search for a CPG "node down" handler in its
>>> sources but didn't find one. I suspect it was handled by some deprecated
>>> daemon (f.e. groupd) in the past, but as of 3.1.7 I did not observe
>>> handling for that.
>>>
>>> As I go with option 3, I should not see that anymore even theoretically.
>>>
>>> So no bug report for what I wont use anymore :)
>>>
>>>>
>>>>> dlm does start fencing for "process lost", but
>>>>> not for "node lost".
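
To illustrate the rule in the add_change() excerpt quoted earlier in this
message (a member that leaves with reason CPG_REASON_NODEDOWN or
CPG_REASON_PROCDOWN is marked failed and later becomes a fencing victim),
here is a minimal standalone sketch in C. It is only an illustration under
stated assumptions: it does not include the corosync headers or the real
fenced data structures, and every sketch_* / SKETCH_* name, the enum values
and the struct layout are hypothetical stand-ins rather than the actual code
in fence/fenced/cpg.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CPG leave reasons. The real constants
 * (CPG_REASON_NODEDOWN, CPG_REASON_PROCDOWN, ...) come from the corosync
 * CPG header; the values below are NOT taken from it. */
enum sketch_reason {
    SKETCH_REASON_LEAVE = 1,   /* clean, voluntary leave      */
    SKETCH_REASON_NODEDOWN,    /* whole node disappeared      */
    SKETCH_REASON_PROCDOWN     /* process died, node still up */
};

/* Hypothetical entry of the "left" list passed to a CPG
 * configuration-change callback. */
struct sketch_left {
    unsigned int nodeid;
    enum sketch_reason reason;
};

/* Copy the node IDs that should be fenced into victims[] (at most max
 * entries) and return how many were stored. A clean leave is ignored;
 * both "node down" and "process down" count as failures. */
size_t sketch_collect_victims(const struct sketch_left *left, size_t n,
                              unsigned int *victims, size_t max)
{
    size_t count = 0;

    /* Precondition checks: non-empty input/output needs valid pointers. */
    assert(left != NULL || n == 0);
    assert(victims != NULL || max == 0);

    for (size_t i = 0; i < n && count < max; i++) {
        if (left[i].reason == SKETCH_REASON_NODEDOWN ||
            left[i].reason == SKETCH_REASON_PROCDOWN) {
            victims[count++] = left[i].nodeid;
        }
    }
    return count;
}
```

The point of the sketch is only that the decision is made from the leave
reason, independently of whether the node rejoins a moment later, which is
the behaviour discussed in this thread.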

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org