[ClusterLabs] Antw: emergency stop does not honor resources ordering constraints (?)
>>> Radoslaw Garbacz schrieb am 06.12.2016 um 18:50 in Nachricht:
> Hi,
>
> I have encountered a problem with pacemaker resource shutdown in what
> seems like any emergency situation, when ordering constraints are not
> honored. I would be grateful for any information on whether this
> behavior is intentional or should not happen (i.e. a testing issue
> rather than pacemaker behavior). It would also be helpful to know
> whether there is a configuration parameter altering this, or whether
> any cluster event can trigger an unordered resource stop.
>
> * Example:
> - having resources ordered with constraints: A -> B -> C
> - when stopping with the 'crm_resource' command (all at once),
>   resources are stopped: C, B, A
> - when stopping by terminating pacemaker, resources are stopped:
>   C, B, A
> - when there is a monitoring error or quorum is lost, no order is
>   honored, e.g.: B, C, A
>
> [version details, ordering constraints, and stop logs snipped -
> quoted in full in the original message below]

Hi!

If the node does not have quorum, it cannot do any cluster operations
(IMHO). Instead it will try to commit suicide, maybe with the help of
self-fencing. So I think this case is normal for no quorum.

Ulrich
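What a partition does after losing quorum is controlled by the no-quorum-policy cluster property, so it is worth checking that before concluding the unordered stop is a bug. A minimal sketch, assuming the pcs shell used elsewhere in this digest (the same property can also be set with crm_attribute):

```shell
# Show the current policy; the default is "stop": stop all resources
# in the partition that lost quorum.
pcs property show no-quorum-policy

# Other values: "ignore" (keep managing resources), "freeze" (leave
# running resources alone, start nothing new), "suicide" (fence the
# node itself) - the behavior Ulrich describes.
pcs property set no-quorum-policy=freeze
```

Note that in a partition that has already lost quorum, these commands may themselves be refused.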
[ClusterLabs] Antw: Re: Error performing operation: Argument list too long
>>> Ken Gaillot schrieb am 06.12.2016 um 16:44 in Nachricht
<58329eaf-cbe0-55e0-7648-849879f1b...@redhat.com>:
> On 12/05/2016 02:29 PM, Shane Lawrence wrote:
>> I'm experiencing a strange issue with pacemaker. It is unable to check
>> the status of a systemd resource.
>>
>> [systemctl status output snipped]
>>
>> Attempting to view the status through Pacemaker shows:
>> [root@xx ~]# crm_resource --force-check -V -r rsyslog
>> Error performing operation: Argument list too long
>> [root@xx ~]# pcs resource debug-monitor rsyslog --full
>> Error performing operation: Argument list too long
>
> That is odd behavior. You may want to open a bug report at
> bugs.clusterlabs.org and attach your configuration and logs.
>
> On Linux, the system error number for "Argument list too long" is the
> same as the OCF monitor status "Not running", so I suspect that it's a
> display issue rather than an actual error, but I'm not sure.

If it's strerror() with the wrong type of argument, it's rather a
programming error than a display error ;-)

> Then the question would just be why rsyslog is stopping.

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
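Ken's observation can be checked from any Linux box: E2BIG ("Argument list too long") is errno 7, and 7 is also the OCF monitor exit status for a cleanly stopped resource. A minimal sketch of the collision:

```python
import errno
import os

# On Linux, errno 7 is E2BIG, whose message is "Argument list too long".
assert errno.E2BIG == 7

# The OCF resource agent API uses the same number 7 for a monitor
# operation that finds the resource not running.
OCF_NOT_RUNNING = 7

# A tool that passes an OCF exit status to strerror() therefore prints
# the E2BIG message instead of something like "not running":
print(os.strerror(OCF_NOT_RUNNING))  # Argument list too long
```

This supports both readings above: the monitor correctly reported "not running", and the confusing message comes from treating that status as a system errno.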
Re: [ClusterLabs] Error performing operation: Argument list too long
On 06.12.2016 20:41, Jan Pokorný wrote:
> On 06/12/16 09:44 -0600, Ken Gaillot wrote:
>> On 12/05/2016 02:29 PM, Shane Lawrence wrote:
>>> I'm experiencing a strange issue with pacemaker. It is unable to check
>>> the status of a systemd resource.
>>>
>>> systemctl shows that the service crashed:

No, it shows that the service process exited gracefully, without errors.
There is no indication of a "crash" in the output you posted.

>>> [root@xx ~]# systemctl status rsyslog
>>> ● rsyslog.service - System Logging Service
>>>  Main PID: 22703 (code=exited, status=0/SUCCESS)
>>> [remaining status output and crm_resource errors snipped]
>>
>> Then the question would just be why rsyslog is stopping.
>
> Even more that "Cluster Controlled rsyslog" has been started while
> "System Logging Service" is being stopped. Could it be a result
> of a namespace/daemon/service clash of some kind?

systemctl status does a simplistic match by unit name (rsyslog.service
in this case) to decide which journal lines to show. It is entirely
possible that between the unit's start and its stop the unit definition
was changed and systemd was reloaded (which often happens implicitly
during package installs, at the very least). That would explain such
behavior:

Dec 07 06:45:59 bor-Latitude-E5450 systemd[1]: Starting Before reload foo...
Dec 07 06:46:09 bor-Latitude-E5450 systemd[1]: Stopped Before reload foo.
Dec 07 06:46:35 bor-Latitude-E5450 systemd[1]: Started Before reload foo.

Edit the Description of the unit and do "systemctl daemon-reload":

Dec 07 06:47:01 bor-Latitude-E5450 systemd[1]: Stopping After reload foo...
Dec 07 06:47:01 bor-Latitude-E5450 systemd[1]: Stopped After reload foo.
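The demonstration above can be reproduced on any systemd host; the foo.service unit, its paths, and the ExecStart command below are throwaway examples (run as root), not anything from the cluster:

```shell
# Create a minimal unit and start it under its original Description.
cat > /etc/systemd/system/foo.service <<'EOF'
[Unit]
Description=Before reload foo

[Service]
ExecStart=/bin/sleep 3600
EOF
systemctl daemon-reload
systemctl start foo

# Change the Description and reload unit definitions while the
# service keeps running.
sed -i 's/Before reload/After reload/' /etc/systemd/system/foo.service
systemctl daemon-reload

# The stop is logged under the *new* name, as in the journal excerpt above.
systemctl stop foo
journalctl -u foo.service --no-pager | tail -n 2
```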
[ClusterLabs] emergency stop does not honor resources ordering constraints (?)
Hi,

I have encountered a problem with pacemaker resource shutdown in what
seems like any emergency situation, when ordering constraints are not
honored. I would be grateful for any information on whether this
behavior is intentional or should not happen (i.e. a testing issue
rather than pacemaker behavior). It would also be helpful to know
whether there is a configuration parameter altering this, or whether
any cluster event can trigger an unordered resource stop.

Thanks.

To illustrate the issue I provide an example below and my collected
data. My environment uses the resource cloning feature - maybe this
contributes to my test results.

* Example:
- having resources ordered with constraints: A -> B -> C
- when stopping with the 'crm_resource' command (all at once), resources
  are stopped: C, B, A
- when stopping by terminating pacemaker, resources are stopped: C, B, A
- when there is a monitoring error or quorum is lost, no order is
  honored, e.g.: B, C, A

* Version details:
Pacemaker 1.1.15-1.1f8e642.git.el6
Corosync Cluster Engine, version '2.4.1.2-0da1'

* My ordering constraints:
Ordering Constraints:
  dbx_first_primary then dbx_head_head (kind:Mandatory)
  dbx_first_primary-clone then dbx_head_head (kind:Mandatory)
  dbx_head_head then dbx_mounts_nodes (kind:Mandatory)
  dbx_head_head then dbx_mounts_nodes-clone (kind:Mandatory)
  dbx_mounts_nodes then dbx_bind_mounts_nodes (kind:Mandatory)
  dbx_mounts_nodes-clone then dbx_bind_mounts_nodes-clone (kind:Mandatory)
  dbx_bind_mounts_nodes then dbx_nfs_nodes (kind:Mandatory)
  dbx_bind_mounts_nodes-clone then dbx_nfs_nodes-clone (kind:Mandatory)
  dbx_nfs_nodes then dbx_gss_datas (kind:Mandatory)
  dbx_nfs_nodes-clone then dbx_gss_datas-clone (kind:Mandatory)
  dbx_gss_datas then dbx_nfs_mounts_datas (kind:Mandatory)
  dbx_gss_datas-clone then dbx_nfs_mounts_datas-clone (kind:Mandatory)
  dbx_nfs_mounts_datas then dbx_swap_nodes (kind:Mandatory)
  dbx_nfs_mounts_datas-clone then dbx_swap_nodes-clone (kind:Mandatory)
  dbx_swap_nodes then dbx_sync_head (kind:Mandatory)
  dbx_swap_nodes-clone then dbx_sync_head (kind:Mandatory)
  dbx_sync_head then dbx_dbx_datas (kind:Mandatory)
  dbx_sync_head then dbx_dbx_datas-clone (kind:Mandatory)
  dbx_dbx_datas then dbx_dbx_head (kind:Mandatory)
  dbx_dbx_datas-clone then dbx_dbx_head (kind:Mandatory)
  dbx_dbx_head then dbx_web_head (kind:Mandatory)
  dbx_web_head then dbx_ready_primary (kind:Mandatory)
  dbx_web_head then dbx_ready_primary-clone (kind:Mandatory)

* Pacemaker stop (OK):
ready.ocf.sh(dbx_ready_primary)[18639]: 2016/12/06_15:40:32 INFO: ready_stop: Stopping resource
mng.ocf.sh(dbx_mng_head)[20312]: 2016/12/06_15:40:44 INFO: mng_stop: Stopping resource
web.ocf.sh(dbx_web_head)[20310]: 2016/12/06_15:40:44 INFO: dbxcl_stop: Stopping resource
dbx.ocf.sh(dbx_dbx_head)[20569]: 2016/12/06_15:40:46 INFO: dbxcl_stop: Stopping resource
sync.ocf.sh(dbx_sync_head)[20719]: 2016/12/06_15:40:54 INFO: sync_stop: Stopping resource
swap.ocf.sh(dbx_swap_nodes)[21053]: 2016/12/06_15:40:56 INFO: swap_stop: Stopping resource
nfs.ocf.sh(dbx_nfs_nodes)[21151]: 2016/12/06_15:40:58 INFO: nfs_stop: Stopping resource
dbx_mounts.ocf.sh(dbx_bind_mounts_nodes)[21344]: 2016/12/06_15:40:59 INFO: dbx_mounts_stop: Stopping resource
dbx_mounts.ocf.sh(dbx_mounts_nodes)[21767]: 2016/12/06_15:41:01 INFO: dbx_mounts_stop: Stopping resource
head.ocf.sh(dbx_head_head)[22213]: 2016/12/06_15:41:04 INFO: head_stop: Stopping resource
first.ocf.sh(dbx_first_primary)[22999]: 2016/12/06_15:41:11 INFO: first_stop: Stopping resource

* Quorum lost:
sync.ocf.sh(dbx_sync_head)[23099]: 2016/12/06_16:42:04 INFO: sync_stop: Stopping resource
nfs.ocf.sh(dbx_nfs_nodes)[23102]: 2016/12/06_16:42:04 INFO: nfs_stop: Stopping resource
mng.ocf.sh(dbx_mng_head)[23101]: 2016/12/06_16:42:04 INFO: mng_stop: Stopping resource
ready.ocf.sh(dbx_ready_primary)[23104]: 2016/12/06_16:42:04 INFO: ready_stop: Stopping resource
web.ocf.sh(dbx_web_head)[23344]: 2016/12/06_16:42:04 INFO: dbxcl_stop: Stopping resource
dbx_mounts.ocf.sh(dbx_bind_mounts_nodes)[23664]: 2016/12/06_16:42:05 INFO: dbx_mounts_stop: Stopping resource
dbx_mounts.ocf.sh(dbx_mounts_nodes)[24459]: 2016/12/06_16:42:08 INFO: dbx_mounts_stop: Stopping resource
head.ocf.sh(dbx_head_head)[25036]: 2016/12/06_16:42:11 INFO: head_stop: Stopping resource
swap.ocf.sh(dbx_swap_nodes)[27491]: 2016/12/06_16:43:08 INFO: swap_stop: Stopping resource

--
Best Regards,

Radoslaw Garbacz
XtremeData Incorporation
Re: [ClusterLabs] Error performing operation: Argument list too long
On 06/12/16 09:44 -0600, Ken Gaillot wrote:
> On 12/05/2016 02:29 PM, Shane Lawrence wrote:
>> I'm experiencing a strange issue with pacemaker. It is unable to check
>> the status of a systemd resource.
>>
>> [systemctl status output and crm_resource errors snipped]
>
> That is odd behavior. You may want to open a bug report at
> bugs.clusterlabs.org and attach your configuration and logs.
>
> On Linux, the system error number for "Argument list too long" is the
> same as the OCF monitor status "Not running", so I suspect that it's a
> display issue rather than an actual error, but I'm not sure.
>
> Then the question would just be why rsyslog is stopping.

Even more that "Cluster Controlled rsyslog" has been started while
"System Logging Service" is being stopped. Could it be a result of a
namespace/daemon/service clash of some kind?

--
Jan (Poki)
Re: [ClusterLabs] Error performing operation: Argument list too long
On 12/05/2016 02:29 PM, Shane Lawrence wrote:
> I'm experiencing a strange issue with pacemaker. It is unable to check
> the status of a systemd resource.
>
> systemctl shows that the service crashed:
> [root@xx ~]# systemctl status rsyslog
> ● rsyslog.service - System Logging Service
>    Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled;
> vendor preset: enabled)
>    Active: inactive (dead) since Mon 2016-12-05 07:41:11 UTC; 12h ago
>      Docs: man:rsyslogd(8)
>            http://www.rsyslog.com/doc/
>  Main PID: 22703 (code=exited, status=0/SUCCESS)
>
> Dec 02 21:41:41 xx...xx systemd[1]: Starting Cluster Controlled rsyslog...
> Dec 02 21:41:41 xx...xx systemd[1]: Started Cluster Controlled rsyslog.
> Dec 05 07:41:08 xx...xx systemd[1]: Stopping System Logging Service...
> Dec 05 07:41:11 xx...xx systemd[1]: Stopped System Logging Service.
> Dec 05 07:41:40 xx...xx systemd[1]: Stopped System Logging Service.
>
> Attempting to view the status through Pacemaker shows:
> [root@xx ~]# crm_resource --force-check -V -r rsyslog
> Error performing operation: Argument list too long
> [root@xx ~]# pcs resource debug-monitor rsyslog --full
> Error performing operation: Argument list too long
>
> The problem seems to be resolved (temporarily) by restarting corosync
> and then starting the cluster again.
>
> Has anyone else experienced this?

That is odd behavior. You may want to open a bug report at
bugs.clusterlabs.org and attach your configuration and logs.

On Linux, the system error number for "Argument list too long" is the
same as the OCF monitor status "Not running", so I suspect that it's a
display issue rather than an actual error, but I'm not sure.

Then the question would just be why rsyslog is stopping.
Re: [ClusterLabs] Antw: [pacemaker+ clvm] Cluster lvm must be active exclusively to create snapshot
Yes! I have tried it and succeeded. A clustered LVM volume can be
snapshotted only while it is activated in exclusive lock mode.

2016-12-06 16:23 GMT+08:00 Ulrich Windl:
> >>> su liu schrieb am 06.12.2016 um 02:16 in Nachricht:
> > ### volume-4fad87bb-3d4c-4a96-bef1-8799980050d1 must be active
> > exclusively to create snapshot ###
> >
> > Can someone tell me how to snapshot a lvm in the cluster lvm
> > environment?
> >
> > [cluster status and volume details snipped - quoted in full in the
> > original message below]
>
> Did you try "vgchange -a e ..."?
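For the record, the working sequence can be sketched as follows, using the LV name from the thread; the snapshot size and name are arbitrary examples. clvmd must be running on every node, and the LV has to be inactive on the other nodes, or exclusive activation fails:

```shell
LV=cinder-volumes/volume-1b0ea468-37c8-4b47-a6fa-6cce65b068b5

# Deactivate cluster-wide, then activate exclusively on this node
# ("-aey" requests an exclusive cluster lock from clvmd).
lvchange -an "$LV"
lvchange -aey "$LV"

# Snapshots of a clustered LV are allowed only while it is active
# exclusively on a single node.
lvcreate --snapshot --size 256M --name volume-snap "$LV"
```

"vgchange -a e cinder-volumes" does the same at volume-group granularity.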
Re: [ClusterLabs] How to DRBD + Pacemaker + Samba in Active/Passive Cluster?
Hi,

like Ulrich already said, you don't need CTDB for "active/passive"
Samba; we have used the standard LSB RA for the Samba init script
without problems for many years.

Best,
Stan

2016-12-05 10:16 GMT+01:00 Semion Itic:
> Hello Everybody,
>
> How to DRBD + Pacemaker + Samba in Active/Passive Cluster?
>
> [original question snipped]
>
> Regards,
> Simon I
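A minimal active/passive layout along these lines might look as follows with pcs. The resource names, DRBD resource r0, device, mount point, IP address, and the lsb:smb script name are all illustrative assumptions (the init script must exist in /etc/init.d on both nodes for the LSB agent):

```shell
# DRBD in master/slave; the master is the active node.
pcs resource create drbd_samba ocf:linbit:drbd drbd_resource=r0
pcs resource master ms_drbd_samba drbd_samba \
    master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

# Filesystem on the DRBD device, a service IP, and Samba via its init script.
pcs resource create fs_samba ocf:heartbeat:Filesystem \
    device=/dev/drbd0 directory=/srv/samba fstype=ext4
pcs resource create ip_samba ocf:heartbeat:IPaddr2 \
    ip=192.168.1.100 cidr_netmask=24
pcs resource create samba lsb:smb

# Keep everything on the DRBD master, started in order.
pcs constraint colocation add fs_samba with master ms_drbd_samba INFINITY
pcs constraint order promote ms_drbd_samba then start fs_samba
pcs constraint colocation add ip_samba with fs_samba INFINITY
pcs constraint order fs_samba then ip_samba
pcs constraint colocation add samba with ip_samba INFINITY
pcs constraint order ip_samba then samba
```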
[ClusterLabs] Antw: [pacemaker+ clvm] Cluster lvm must be active exclusively to create snapshot
>>> su liu schrieb am 06.12.2016 um 02:16 in Nachricht:
> Hi all,
>
> I am new to pacemaker and I have some questions about clvmd +
> pacemaker + corosync. I wish you could explain it for me if you are
> free. Thank you very much!
>
> I have 2 nodes and the pacemaker status is as follows:
>
> [root@controller ~]# pcs status --full
> Cluster name: mycluster
> Last updated: Mon Dec 5 18:15:12 2016    Last change: Fri Dec 2
> 15:01:03 2016 by root via cibadmin on compute1
> Stack: corosync
> Current DC: compute1 (2) (version 1.1.13-10.el7_2.4-44eb2dd) -
> partition with quorum
> 2 nodes and 4 resources configured
>
> Online: [ compute1 (2) controller (1) ]
>
> Full list of resources:
>
>  Clone Set: dlm-clone [dlm]
>      dlm (ocf::pacemaker:controld): Started compute1
>      dlm (ocf::pacemaker:controld): Started controller
>      Started: [ compute1 controller ]
>  Clone Set: clvmd-clone [clvmd]
>      clvmd (ocf::heartbeat:clvm): Started compute1
>      clvmd (ocf::heartbeat:clvm): Started controller
>      Started: [ compute1 controller ]
>
> Node Attributes:
> * Node compute1 (2):
> * Node controller (1):
>
> Migration Summary:
> * Node compute1 (2):
> * Node controller (1):
>
> PCSD Status:
>   controller: Online
>   compute1: Online
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled
>
> I create an LV on the controller node and it can be seen on the
> compute1 node immediately with the 'lvs' command, but the LV is not
> activated on compute1. Then I want to create a snapshot of the LV,
> but it fails with the error message:
>
> ### volume-4fad87bb-3d4c-4a96-bef1-8799980050d1 must be active
> exclusively to create snapshot ###
>
> Can someone tell me how to snapshot an LV in a cluster LVM
> environment? Thank you very much.

Did you try "vgchange -a e ..."?

> Additional information:
>
> [root@controller ~]# vgdisplay
>   --- Volume group ---
>   VG Name               cinder-volumes
>   System ID
>   Format                lvm2
>   Metadata Areas        1
>   Metadata Sequence No  19
>   VG Access             read/write
>   VG Status             resizable
>   Clustered             yes
>   Shared                no
>   MAX LV                0
>   Cur LV                1
>   Open LV               0
>   Max PV                0
>   Cur PV                1
>   Act PV                1
>   VG Size               1000.00 GiB
>   PE Size               4.00 MiB
>   Total PE              255999
>   Alloc PE / Size       256 / 1.00 GiB
>   Free  PE / Size       255743 / 999.00 GiB
>   VG UUID               aLamHi-mMcI-2NsC-Spjm-QWZr-MzHx-pPYSTt
>
> [root@controller ~]# rpm -qa | grep pacem
> pacemaker-cli-1.1.13-10.el7_2.4.x86_64
> pacemaker-libs-1.1.13-10.el7_2.4.x86_64
> pacemaker-1.1.13-10.el7_2.4.x86_64
> pacemaker-cluster-libs-1.1.13-10.el7_2.4.x86_64
>
> [root@controller ~]# lvs
>   LV                                          VG             Attr   LSize
>   volume-1b0ea468-37c8-4b47-a6fa-6cce65b068b5 cinder-volumes -wi-a- 1.00g
>
> [root@compute1 ~]# lvs
>   LV                                          VG             Attr   LSize
>   volume-1b0ea468-37c8-4b47-a6fa-6cce65b068b5 cinder-volumes -wi--- 1.00g
>
> Thank you very much!
[ClusterLabs] Antw: How to DRBD + Pacemaker + Samba in Active/Passive Cluster?
>>> Semion Itic schrieb am 05.12.2016 um 10:16 in Nachricht
<7bdd2f4d-7c0a-49b1-b42e-9668befbb...@outlook.de>:
> Hello Everybody,
>
> How to DRBD + Pacemaker + Samba in Active/Passive Cluster?
>
> I have been searching for many days now for how to integrate drbd,
> pacemaker and corosync in a two-node active/passive cluster (with a
> service IP) with SAMBA. I still don't understand how to go further
> after mounting the filesystem; I want to integrate Samba into the
> pacemaker process as a service. I saw that the main solution to this
> is using CTDB, but that seems very complex to me. Does anybody have
> experience with this combination of topics who can provide me with
> instructions, or at least advice?

Hi!

Active/passive suggests you want Samba on one node (and thus DRBD in a
master/slave configuration). CTDB is not complex, but CTDB is an
active/active configuration (AFAIK), and you'd need DRBD in dual-master
configuration as well as a cluster filesystem (for a lock file only,
BTW). CTDB replicates the configuration to the local nodes.

Ulrich

> Regards,
> Simon I