[ClusterLabs] Definition of function pointer cpg_totem_confchg_fn_t
typedef void (*cpg_totem_confchg_fn_t) (
    cpg_handle_t handle,
    struct cpg_ring_id ring_id,
    uint32_t member_list_entries,
    const uint32_t *member_list);

Should "struct cpg_ring_id ring_id" be "struct cpg_ring_id *ring_id"?

Regards,
Dashi Cao
___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
Re: [ClusterLabs] agent ocf:pacemaker:controld
The manual "Pacemaker 1.1 Clusters from Scratch" gives the false impression that gfs2 relies only on dlm, but I cannot make it work without gfs_controld. Again, this little daemon is heavily coupled with cman. I think it is quite hard to use gfs2 in a cluster built using only "pacemaker+corosync"! Am I wrong?

Thanks a lot!
Dashi Cao

________________
From: Da Shi Cao <dscao...@hotmail.com>
Sent: Thursday, July 21, 2016 9:31:51 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] agent ocf:pacemaker:controld

I've built the dlm_tool suite using the source from https://git.fedorahosted.org/cgit/dlm.git/log/. The resource using ocf:pacemaker:controld always fails to start because of a timeout, even if the start timeout is set to 120s! But if dlm_controld is first started outside the cluster management, then the resource shows up and stays healthy!

Another question: what is the difference between dlm_controld and gfs_controld? Must they both be present if a cluster gfs file system is mounted?

Thanks a lot!
Dashi Cao

____________
From: Da Shi Cao <dscao...@hotmail.com>
Sent: Wednesday, July 20, 2016 4:47:31 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] agent ocf:pacemaker:controld

Thank you all for the information about dlm_controld. I will give https://git.fedorahosted.org/cgit/dlm.git/log/ a try.

Dashi Cao

From: Jan Pokorný <jpoko...@redhat.com>
Sent: Monday, July 18, 2016 8:47:50 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] agent ocf:pacemaker:controld

> On 18/07/16 07:59, Da Shi Cao wrote:
>> dlm_controld is very tightly coupled with cman.

Wrong assumption. In fact, support for shipping ocf:pacemaker:controld has been explicitly restricted to cases when CMAN logic (specifically the respective handle-all initscript that is in turn, in that limited use case, triggered from pacemaker's proper one and, moreover, takes care of dlm_controld management on its own so any subsequent attempts to do the same would be ineffective) is _not_ around:
https://github.com/ClusterLabs/pacemaker/commit/6a11d2069dcaa57b445f73b52f642f694e55caf3
(accidental syntactical typos were fixed later on:
https://github.com/ClusterLabs/pacemaker/commit/aa5509df412cb9ea39ae3d3918e0c66c326cda77)

>> I have built a cluster purely with
>> pacemaker+corosync+fence_sanlock. But if agent
>> ocf:pacemaker:controld is desired, dlm_controld must exist! I can
>> only find it in cman.
>> Can the command dlm_controld be obtained without bringing in cman?

To recap what others have suggested:

On 18/07/16 08:57 +0100, Christine Caulfield wrote:
> There should be a package called 'dlm' that has a dlm_controld suitable
> for use with pacemaker.

On 18/07/16 17:26 +0800, Eric Ren wrote:
> DLM upstream hosted here:
> https://git.fedorahosted.org/cgit/dlm.git/log/
>
> The name of DLM on openSUSE is libdlm.

--
Jan (Poki)
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] agent ocf:pacemaker:controld
I've built the dlm_tool suite using the source from https://git.fedorahosted.org/cgit/dlm.git/log/. The resource using ocf:pacemaker:controld always fails to start because of a timeout, even if the start timeout is set to 120s! But if dlm_controld is first started outside the cluster management, then the resource shows up and stays healthy!

Another question: what is the difference between dlm_controld and gfs_controld? Must they both be present if a cluster gfs file system is mounted?

Thanks a lot!
Dashi Cao

From: Da Shi Cao <dscao...@hotmail.com>
Sent: Wednesday, July 20, 2016 4:47:31 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] agent ocf:pacemaker:controld

Thank you all for the information about dlm_controld. I will give https://git.fedorahosted.org/cgit/dlm.git/log/ a try.

Dashi Cao

From: Jan Pokorný <jpoko...@redhat.com>
Sent: Monday, July 18, 2016 8:47:50 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
Subject: Re: [ClusterLabs] agent ocf:pacemaker:controld

> On 18/07/16 07:59, Da Shi Cao wrote:
>> dlm_controld is very tightly coupled with cman.

Wrong assumption. In fact, support for shipping ocf:pacemaker:controld has been explicitly restricted to cases when CMAN logic (specifically the respective handle-all initscript that is in turn, in that limited use case, triggered from pacemaker's proper one and, moreover, takes care of dlm_controld management on its own so any subsequent attempts to do the same would be ineffective) is _not_ around:
https://github.com/ClusterLabs/pacemaker/commit/6a11d2069dcaa57b445f73b52f642f694e55caf3
(accidental syntactical typos were fixed later on:
https://github.com/ClusterLabs/pacemaker/commit/aa5509df412cb9ea39ae3d3918e0c66c326cda77)

>> I have built a cluster purely with
>> pacemaker+corosync+fence_sanlock. But if agent
>> ocf:pacemaker:controld is desired, dlm_controld must exist! I can
>> only find it in cman.
>> Can the command dlm_controld be obtained without bringing in cman?

To recap what others have suggested:

On 18/07/16 08:57 +0100, Christine Caulfield wrote:
> There should be a package called 'dlm' that has a dlm_controld suitable
> for use with pacemaker.

On 18/07/16 17:26 +0800, Eric Ren wrote:
> DLM upstream hosted here:
> https://git.fedorahosted.org/cgit/dlm.git/log/
>
> The name of DLM on openSUSE is libdlm.

--
Jan (Poki)
[ClusterLabs] agent ocf:pacemaker:controld
dlm_controld is very tightly coupled with cman. I have built a cluster purely with pacemaker+corosync+fence_sanlock. But if agent ocf:pacemaker:controld is desired, dlm_controld must exist! I can only find it in cman.

Can the command dlm_controld be obtained without bringing in cman?

Best Regards
Dashi Cao
[ClusterLabs] fence_sanlock and pacemaker
Hello everybody,

After some trial and error, fence_sanlock can be used as a stonith resource in pacemaker+corosync. The changes I made:

1. Add a "monitor" action, which is exactly the same action as "status".
2. Make the "status" action return "false" if the resource belonging to a host has been acquired and is owned by another host. In version 3.3.0 it erroneously returned "true" because it did not check the owner id of the resource.
3. Make fence_sanlockd retry several times before failing if the resource for a host is owned by another host. This gives a time window for the resource to be released manually on the other host. Sometimes the resource of a host gets locked permanently by another host if the "off" action fails, often because of a timeout.

Best Regards
Dashi Cao
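The three changes listed above can be sketched as plain C logic. This is an illustration under stated assumptions, not the fence_sanlock code: the `demo_` names, the lease struct and its fields, and the callback-based retry are all hypothetical, and how "status" behaves when no lease is held is deliberately left to the caller through `base_ok`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical view of a per-host lease; field names are illustrative and
 * are not taken from sanlock. */
struct demo_lease {
    bool held;     /* lease is currently acquired by some host */
    int owner_id;  /* host id of the current owner */
};

/* Change 1: treat "monitor" as an alias of "status". */
static const char *demo_normalize_action(const char *action)
{
    return strcmp(action, "monitor") == 0 ? "status" : action;
}

/* Change 2: status must fail when the host's lease is held by another host.
 * base_ok is whatever the status check would have returned otherwise. */
static bool demo_status(bool base_ok, const struct demo_lease *lease,
                        int host_id)
{
    bool owned_by_other = lease->held && lease->owner_id != host_id;
    return base_ok && !owned_by_other;
}

/* Change 3: retry an acquisition a bounded number of times. */
typedef bool (*demo_acquire_fn)(void *ctx);

static bool demo_acquire_with_retry(demo_acquire_fn try_acquire, void *ctx,
                                    int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (try_acquire(ctx))
            return true;
        /* A real daemon would sleep here, leaving a window for the lease
         * to be released manually on the other host. */
    }
    return false;
}

/* Example callback: fails until the counter pointed to by ctx hits zero. */
static bool demo_succeeds_after(void *ctx)
{
    int *remaining_failures = ctx;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With `max_attempts` set to 3, an acquisition that fails twice and then succeeds is reported as a success, while one that keeps failing is reported as a failure after the third attempt.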
Re: [ClusterLabs] pacemaker and fence_sanlock
Hello Ken,

Yes, I installed fence_sanlock and fence_sanlockd in the same directory as corosync and pacemaker. There is no monitor action in the output of stonith_admin --meta --agent=fence_sanlock, and none in the output of "crm_resource --show-metadata=stonith:fence_sanlock", either. So I guess I will have to fake a monitor action using "fence_sanlock -p path -i id -o status".

Thank you very much!
Dashi Cao

From: Ken Gaillot <kgail...@redhat.com>
Sent: Thursday, May 12, 2016 10:32:33 PM
To: users@clusterlabs.org
Subject: Re: [ClusterLabs] pacemaker and fence_sanlock

On 05/11/2016 09:14 PM, Da Shi Cao wrote:
> Dear all,
>
> I'm just beginning to use pacemaker+corosync as our HA solution on
> Linux, but I got stuck at the stage of configuring fencing.
>
> Pacemaker 1.1.15, Corosync Cluster Engine, version '2.3.5.46-d245', and
> sanlock 3.3.0 (built May 10 2016 05:13:12)
>
> I have the following questions:
>
> 1. stonith_admin --list-installed will only list two agents: fence_pcmk,
> fence_legacy before sanlock is compiled and installed under /usr/local.
> But after "make install" of sanlock, stonith_admin --list-installed will
> list:
>
> fence_sanlockd
> fence_sanlock
> fence_pcmk
> fence_legacy
>
> It is weird and I wonder what makes stonith_admin know about fence_sanlock?

I'm guessing you also installed pacemaker under /usr/local; stonith_admin will simply list $installdir/sbin/fence_*

> 2. How do I configure fencing by fence_sanlock in pacemaker? I've
> tried to create a new resource to do the unfencing for each node, but
> the resource start will fail since there is no monitor operation of
> fence_sanlock agent, because resource manager will fire monitor once
> after the start to make sure it has been started OK.

I'm not familiar with fence_sanlock, but it should be fine to do what you describe. There's probably an issue with your configuration. What command did you use to configure the resource?

> 3. How do I create a fencing resource that fences via sanlock? This
> I've not tried yet. But I wonder which node/nodes of the majority will
> initiate the fence operations to the nodes without quorum.

Once you've defined the resource in the pacemaker configuration, the cluster will intelligently decide when and how to call it.

When you check the cluster status, you'll see that the fence device is "running" on one node. In fact, any node can use the fence device (assuming the configuration doesn't specifically ban it); the listed node is the one running the recurring monitor on the resource. The cluster considers that node to have "verified" access to the device, so it will prefer that node when fencing using the device -- but it may decide to choose another node when appropriate.

You may be interested to know that pacemaker has recently gained native support for watchdog-based fencing via the "sbd" software package. See:

http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit/
http://clusterlabs.org/wiki/Using_SBD_with_Pacemaker

Some discussion of common configuration issues can be seen at:

https://bugzilla.redhat.com/show_bug.cgi?id=1221680

If you have a Red Hat subscription, Red Hat has a simple walk-through for configuring sbd with pacemaker on RHEL 6.8+/7.1+ (using watchdog only, no "poison pill" shared storage):

https://access.redhat.com/articles/2212861

> Thank you very much.
> Dashi Cao
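Ken's guess about question 1, that the listing is simply whatever matches $installdir/sbin/fence_*, amounts to a filename glob. The sketch below shows that idea with POSIX glob(3). It is not pacemaker's code: the function name `demo_count_fence_agents` and the directory argument are assumptions made for illustration.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>
#include <stdio.h>

/* Counts directory entries matching <sbindir>/fence_*.
 * Returns the number of matches, 0 if there are none, or -1 on error.
 * Hypothetical helper; not part of pacemaker. */
static int demo_count_fence_agents(const char *sbindir)
{
    char pattern[4096];
    int len = snprintf(pattern, sizeof pattern, "%s/fence_*", sbindir);
    if (len < 0 || (size_t)len >= sizeof pattern)
        return -1; /* formatting failed or the path was too long */

    glob_t matches;
    int rc = glob(pattern, 0, NULL, &matches);
    int count;
    if (rc == 0)
        count = (int)matches.gl_pathc;
    else if (rc == GLOB_NOMATCH)
        count = 0;
    else
        count = -1; /* GLOB_NOSPACE or GLOB_ABORTED */

    globfree(&matches);
    return count;
}
```

Under this model, installing sanlock into the same prefix adds fence_sanlock and fence_sanlockd to the matches without any registration step, which is consistent with the behavior described in question 1.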
[ClusterLabs] pacemaker and fence_sanlock
Dear all,

I'm just beginning to use pacemaker+corosync as our HA solution on Linux, but I got stuck at the stage of configuring fencing.

Pacemaker 1.1.15, Corosync Cluster Engine, version '2.3.5.46-d245', and sanlock 3.3.0 (built May 10 2016 05:13:12)

I have the following questions:

1. stonith_admin --list-installed will only list two agents: fence_pcmk, fence_legacy before sanlock is compiled and installed under /usr/local. But after "make install" of sanlock, stonith_admin --list-installed will list:

fence_sanlockd
fence_sanlock
fence_pcmk
fence_legacy

It is weird and I wonder what makes stonith_admin know about fence_sanlock?

2. How do I configure fencing by fence_sanlock in pacemaker? I've tried to create a new resource to do the unfencing for each node, but the resource start fails since the fence_sanlock agent has no monitor operation, and the resource manager fires a monitor once after the start to make sure it has started OK.

3. How do I create a fencing resource that fences via sanlock? This I've not tried yet. But I wonder which node/nodes of the majority will initiate the fence operations to the nodes without quorum.

Thank you very much.
Dashi Cao