Hi Ulrich,

It's not that it isn't working for me; it's that, to make it at least
appear to work, I've had to modify dlm.conf -- and I've found zero
mention of this being necessary in any of the tutorials or walkthroughs
I've read.  I was curious what ramifications I would encounter by
setting 'enable_fencing=0' in dlm.conf.
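
For reference, the entire change is this one line (assuming the usual
/etc/dlm/dlm.conf location on our systems):

    # /etc/dlm/dlm.conf
    # tell dlm_controld not to initiate fencing itself;
    # pacemaker/stonith still performs the actual fencing
    enable_fencing=0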

From what I can tell, my config is sane with regard to interleave,
colocation, and ordering.

SBD is a possibility we're considering, but haven't fully embraced yet.
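
If we do go that route, the rough shape would be something like this
(assuming the sbd tools and the fence_sbd agent are installed; the
device path is illustrative):

    # initialize SBD metadata on a shared LUN (illustrative path)
    sbd -d /dev/disk/by-id/<shared-lun> create
    # register a stonith resource that fences through that device
    pcs stonith create fence-sbd fence_sbd devices=/dev/disk/by-id/<shared-lun>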

I'm still planning to test 'enable_startup_fencing=0' instead of
'enable_fencing=0' in dlm.conf and will report back (in case anyone is
interested).
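
In other words, swapping the one line in dlm.conf (my reading of the
suggestion; untested so far):

    # /etc/dlm/dlm.conf
    # keep DLM fencing enabled in general, but skip the fencing
    # pass dlm_controld performs at daemon startup
    enable_startup_fencing=0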

Best,
-Pat

On Tue, Oct 2, 2018 at 2:25 AM Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:

> Hi!
>
> I'm sorry that DLM/cLVM does not work for you. Did you double-check the
> configuration (meta interleave=true, colocation and ordering), especially
> the clones?
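>
> For illustration, the shape I mean, assuming pcs and the usual
> controld/clvm agents (resource and clone names are examples):
>
>     pcs resource create dlm ocf:pacemaker:controld \
>         op monitor interval=30s on-fail=fence clone interleave=true
>     pcs resource create clvmd ocf:heartbeat:clvm \
>         op monitor interval=30s on-fail=fence clone interleave=true
>     pcs constraint order start dlm-clone then clvmd-clone
>     pcs constraint colocation add clvmd-clone with dlm-clone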
> Also as you have shared storage, why don't you use SBD for fencing?
>
> Regards,
> Ulrich
>
>
> >>> Patrick Whitney <pwhit...@luminoso.com> wrote on 01.10.2018 at
> 22:01 in message
> <cae0zlk_va6gthz9tg3woecua2ridaehaoq8ieqdz4meokcy...@mail.gmail.com>:
> > Hi Ulrich,
> >
> > When I first encountered this issue, I posted this:
> >
> > https://lists.clusterlabs.org/pipermail/users/2018-September/015637.html
> >
> > ... I was using resource fencing in that example, but, as I've
> > mentioned before, the issue would arise not when fencing occurred,
> > but when the fenced node was shut down.
> >
> > During that discussion, you and others suggested that power fencing
> > was the only way DLM was going to cooperate, and using meatware was
> > proposed as one option.
> >
> > Unfortunately, I found out later that meatware was no longer available (
> > https://lists.clusterlabs.org/pipermail/users/2018-September/015715.html ),
> > but we were lucky in that our test environment is a KVM/libvirt
> > environment, so I used fence_virsh.  Again, I had the same problem:
> > when the "bad" node was fenced, dlm_controld would issue (what
> > appears to be) a fence_all, I would receive messages that the dlm
> > clone was down on all members, and the log would show that the clvm
> > lockspace was abandoned.
> >
> > It was only when I disabled fencing for dlm (enable_fencing=0 in
> > dlm.conf, while keeping fencing enabled in pcmk) that things began
> > to work as expected.
> >
> > One suggestion earlier in this thread was to try disabling DLM's
> > startup fencing (enable_startup_fencing=0), which sounds like a
> > plausible solution after looking over the logs, but I haven't tested
> > it yet.
> >
> > The conclusion I'm coming to is:
> > 1. The reason DLM cannot handle resource fencing is that it keeps
> > its own "heartbeat/control" channel (for lack of a better term) over
> > the network, and pcmk cannot instruct DLM "don't worry about that
> > guy over there," which means we must use power fencing; but
> > 2. DLM does not like to see one of its members disappear; when that
> > does happen, DLM does "something" which causes the lockspace to
> > disappear... unless you disable fencing for DLM.
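> >
> > (For anyone who wants to watch this happen, "dlm_tool ls" and
> > "dlm_tool status" on the surviving node show the lockspace state and
> > the daemon/member state, respectively.)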
> >
> > I am now speculating that DLM restarts when communications fail,
> > which supports the theory that disabling startup fencing for DLM
> > (enable_startup_fencing=0) may be the solution to my problem
> > (letting me revert my enable_fencing=0 DLM config).
> >
> > Best,
> > -Pat
> >
> > On Mon, Oct 1, 2018 at 3:38 PM Ulrich Windl <ulrich.wi...@rz.uni-regensburg.de> wrote:
> >
> >> Hi!
> >>
> >> It would be much more helpful if you could provide logs around the
> >> problem events.  Personally I think you _must_ implement proper
> >> fencing.  In addition, DLM seems to do its own fencing when there
> >> is a communication problem.
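> >>
> >> For example, something like this from both nodes around the failure
> >> time (unit names and log paths vary by distro; the timestamp is a
> >> placeholder):
> >>
> >>     journalctl -u corosync -u pacemaker --since "2018-10-01 15:00"
> >>     grep dlm_controld /var/log/messages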
> >>
> >> Regards,
> >> Ulrich
> >>
> >>
> >> >>> Patrick Whitney <pwhit...@luminoso.com> 01.10.18 16:25 >>>
> >> Hi Everyone,
> >>
> >> I wanted to solicit input on my configuration.
> >>
> >> I have a two node (test) cluster running corosync/pacemaker with DLM and
> >> CLVM.
> >>
> >> I was running into an issue where, when one node failed, the
> >> remaining node would appear to do the right thing, from the pcmk
> >> perspective, that is: it would create a new cluster (of one) and
> >> fence the other node.  But then, rather surprisingly, DLM would see
> >> the other node offline, and it would go offline itself, abandoning
> >> the lockspace.
> >>
> >> I changed my DLM settings to "enable_fencing=0", disabling DLM
> >> fencing, and our tests are now working as expected.
> >>
> >> I'm a little concerned that I have masked an issue by doing this,
> >> as in all of the tutorials and docs I've read there is no mention
> >> of having to configure DLM whatsoever.
> >>
> >> Is anyone else running a similar stack and can comment?
> >>
> >> Best,
> >> -Pat
> >> --
> >> Patrick Whitney
> >> DevOps Engineer -- Tools
> >>
> >
> >
> > --
> > Patrick Whitney
> > DevOps Engineer -- Tools
>
>
>


-- 
Patrick Whitney
DevOps Engineer -- Tools
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
