On 12.05.2021 17:34, fatcha...@gmx.de wrote:
> Hi Andrei, Hi everybody,
>
>> Sent: Wednesday, 12 May 2021 at 16:01
>> From: fatcha...@gmx.de
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>
>> Hi Andrei, Hi everybody,
>>
>>> You need to order fs_database after the promote operation; and as I just
>>> found, pacemaker also does not reverse it correctly and executes the fs
>>> stop and the drbd demote concurrently. So you need an additional order
>>> constraint to first stop the fs, then demote drbd.
>>
>> Is there some good documentation about this? I don't know how to achieve
>> an "after promote operation" ordering, and how can I tell pcs to first
>> unmount the filesystem mountpoint and then demote the drbd device?
>>
> ok, so I found something and used this:
>
>   pcs constraint order stop fs_logfiles then demote drbd_logsfiles-clone
>   pcs constraint order stop fs_database then demote database_drbd-clone
>
> and it works great. Thanks for the hint.
>
> But the thing I still don't understand is why the cluster demotes its
> active node for a short time when I bring a node back from standby to
> unstandby. Is it not possible to join the drbd as secondary without
> demoting the primary for a short moment?

Try adding interleave=true to your clones.
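A minimal sketch of that (untested here; clone names taken from your config
below, pcs syntax may differ slightly between versions):

  pcs resource meta database_drbd-clone interleave=true
  pcs resource meta drbd_logsfiles-clone interleave=true

With interleave=true, an instance of a dependent clone only waits for the
other clone's instance on its own node instead of for all instances
cluster-wide, so a node returning from standby should no longer drag the
already-promoted node through a demote/promote cycle.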

> Best regards and take care
>
> fatcharly
>
>
>> Sorry, but this is new for me.
>>
>> Best regards and take care
>>
>> fatcharly
>>
>>
>>> Sent: Tuesday, 11 May 2021 at 17:19
>>> From: "Andrei Borzenkov" <arvidj...@gmail.com>
>>> To: users@clusterlabs.org
>>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>>
>>> On 11.05.2021 17:43, fatcha...@gmx.de wrote:
>>>> Hi,
>>>>
>>>> I'm using CentOS 8.3.2011 with pacemaker-2.0.4-6.el8_3.1.x86_64,
>>>> corosync-3.0.3-4.el8.x86_64 and kmod-drbd90-9.0.25-2.el8_3.elrepo.x86_64.
>>>> The cluster consists of two nodes which provide a ha-mariadb with the
>>>> help of two drbd devices, one for the database and one for the logfiles.
>>>> Corosync runs over two rings and both machines are virtual kvm-guests.
>>>>
>>>> Problem:
>>>> Node susanne is the active node and lisbon is changing from standby to
>>>> active; susanne is trying to demote one drbd device but fails to.
>>>> The cluster keeps working properly, but the error stays.
>>>> This is what happens:
>>>>
>>>> Cluster Summary:
>>>>   * Stack: corosync
>>>>   * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>>   * Last updated: Tue May 11 16:15:54 2021
>>>>   * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>>   * 2 nodes configured
>>>>   * 11 resource instances configured
>>>>
>>>> Node List:
>>>>   * Online: [ lisbon susanne ]
>>>>
>>>> Active Resources:
>>>>   * HA_IP (ocf::heartbeat:IPaddr2): Started susanne
>>>>   * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>>     * Masters: [ susanne ]
>>>>     * Slaves: [ lisbon ]
>>>>   * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>>     * drbd_logsfiles (ocf::linbit:drbd): Demoting susanne
>>>>   * fs_logfiles (ocf::heartbeat:Filesystem): Started susanne
>>>
>>> Presumably fs_logfiles is located on drbd_logsfiles, so how come it is
>>> active while drbd_logsfiles is being demoted? Then drbdadm fails to
>>> change the status to secondary and the RA simply loops forever until
>>> the timeout.
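>>> (For illustration: while /dev/drbd2 is still mounted, a manual demote
>>> would be expected to fail with something like
>>>
>>>   # drbdadm secondary drbd2
>>>   drbd2: State change failed: (-12) Device is held open by someone
>>>
>>> so the demote can never succeed until the filesystem is stopped.)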
>>>
>>>>   * fs_database (ocf::heartbeat:Filesystem): Started susanne
>>>>   * mysql-server (ocf::heartbeat:mysql): Started susanne
>>>>   * Clone Set: ping_fw-clone [ping_fw]:
>>>>     * Started: [ lisbon susanne ]
>>>>
>>>> -------------------------------------------------------------------------------------------
>>>> after a few seconds it switches over:
>>>>
>>>> Cluster Summary:
>>>>   * Stack: corosync
>>>>   * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>>   * Last updated: Tue May 11 16:17:59 2021
>>>>   * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>>   * 2 nodes configured
>>>>   * 11 resource instances configured
>>>>
>>>> Node List:
>>>>   * Online: [ lisbon susanne ]
>>>>
>>>> Active Resources:
>>>>   * HA_IP (ocf::heartbeat:IPaddr2): Started susanne
>>>>   * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>>     * Masters: [ susanne ]
>>>>     * Slaves: [ lisbon ]
>>>>   * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>>     * Masters: [ susanne ]
>>>>     * Slaves: [ lisbon ]
>>>>   * fs_logfiles (ocf::heartbeat:Filesystem): Started susanne
>>>>   * fs_database (ocf::heartbeat:Filesystem): Started susanne
>>>>   * mysql-server (ocf::heartbeat:mysql): Started susanne
>>>>   * Resource Group: apache:
>>>>     * httpd_srv (ocf::heartbeat:apache): Started susanne
>>>>   * Clone Set: ping_fw-clone [ping_fw]:
>>>>     * Started: [ lisbon susanne ]
>>>>
>>>> Failed Resource Actions:
>>>>   * drbd_logsfiles_demote_0 on susanne 'error' (1): call=736, status='Timed Out',
>>>>     exitreason='', last-rc-change='2021-05-11 16:15:42 +02:00', queued=0ms, exec=90001ms
>>>> ----------------------------------------------------------------------------------------------
>>>>
>>>
>>> And what do you see in the logs?
>>>
>>>> I think it is a constraint problem, but I can't find it.
>>>> This is my config:
>>>> [root@susanne pacemaker]# pcs config show
>>>> Cluster Name: mysql_cluster
>>>> Corosync Nodes:
>>>>  susanne lisbon
>>>> Pacemaker Nodes:
>>>>  lisbon susanne
>>>>
>>>> Resources:
>>>>  Resource: HA_IP (class=ocf provider=heartbeat type=IPaddr2)
>>>>   Attributes: cidr_netmask=24 ip=192.168.18.154
>>>>   Operations: monitor interval=15s (HA_IP-monitor-interval-15s)
>>>>               start interval=0s timeout=20s (HA_IP-start-interval-0s)
>>>>               stop interval=0s timeout=20s (HA_IP-stop-interval-0s)
>>>>  Clone: database_drbd-clone
>>>>   Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>>   Resource: database_drbd (class=ocf provider=linbit type=drbd)
>>>>    Attributes: drbd_resource=drbd1
>>>>    Operations: demote interval=0s timeout=90 (database_drbd-demote-interval-0s)
>>>>                monitor interval=20 role=Slave timeout=20 (database_drbd-monitor-interval-20)
>>>>                monitor interval=10 role=Master timeout=20 (database_drbd-monitor-interval-10)
>>>>                notify interval=0s timeout=90 (database_drbd-notify-interval-0s)
>>>>                promote interval=0s timeout=90 (database_drbd-promote-interval-0s)
>>>>                reload interval=0s timeout=30 (database_drbd-reload-interval-0s)
>>>>                start interval=0s timeout=240 (database_drbd-start-interval-0s)
>>>>                stop interval=0s timeout=100 (database_drbd-stop-interval-0s)
>>>>  Clone: drbd_logsfiles-clone
>>>>   Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>>   Resource: drbd_logsfiles (class=ocf provider=linbit type=drbd)
>>>>    Attributes: drbd_resource=drbd2
>>>>    Operations: demote interval=0s timeout=90 (drbd_logsfiles-demote-interval-0s)
>>>>                monitor interval=20 role=Slave timeout=20 (drbd_logsfiles-monitor-interval-20)
>>>>                monitor interval=10 role=Master timeout=20 (drbd_logsfiles-monitor-interval-10)
>>>>                notify interval=0s timeout=90 (drbd_logsfiles-notify-interval-0s)
>>>>                promote interval=0s timeout=90 (drbd_logsfiles-promote-interval-0s)
>>>>                reload interval=0s timeout=30 (drbd_logsfiles-reload-interval-0s)
>>>>                start interval=0s timeout=240 (drbd_logsfiles-start-interval-0s)
>>>>                stop interval=0s timeout=100 (drbd_logsfiles-stop-interval-0s)
>>>>  Resource: fs_logfiles (class=ocf provider=heartbeat type=Filesystem)
>>>>   Attributes: device=/dev/drbd2 directory=/mnt/clusterfs2 fstype=ext4
>>>>   Operations: monitor interval=20s timeout=40s (fs_logfiles-monitor-interval-20s)
>>>>               start interval=0s timeout=60s (fs_logfiles-start-interval-0s)
>>>>               stop interval=0s timeout=60s (fs_logfiles-stop-interval-0s)
>>>>  Resource: fs_database (class=ocf provider=heartbeat type=Filesystem)
>>>>   Attributes: device=/dev/drbd1 directory=/mnt/clusterfs1 fstype=ext4
>>>>   Operations: monitor interval=20s timeout=40s (fs_database-monitor-interval-20s)
>>>>               start interval=0s timeout=60s (fs_database-start-interval-0s)
>>>>               stop interval=0s timeout=60s (fs_database-stop-interval-0s)
>>>>  Resource: mysql-server (class=ocf provider=heartbeat type=mysql)
>>>>   Attributes: additional_parameters=--bind-address=0.0.0.0 binary=/usr/bin/mysqld_safe config=/etc/my.cnf datadir=/mnt/clusterfs1/mysql pid=/var/lib/mysql/run/mariadb.pid socket=/var/lib/mysql/mysql.sock
>>>>   Operations: demote interval=0s timeout=120s (mysql-server-demote-interval-0s)
>>>>               monitor interval=20s timeout=30s (mysql-server-monitor-interval-20s)
>>>>               notify interval=0s timeout=90s (mysql-server-notify-interval-0s)
>>>>               promote interval=0s timeout=120s (mysql-server-promote-interval-0s)
>>>>               start interval=0s timeout=60s (mysql-server-start-interval-0s)
>>>>               stop interval=0s timeout=60s (mysql-server-stop-interval-0s)
>>>>  Group: apache
>>>>   Resource: httpd_srv (class=ocf provider=heartbeat type=apache)
>>>>    Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://127.0.0.1/server-status
>>>>    Operations: monitor interval=10s timeout=20s (httpd_srv-monitor-interval-10s)
>>>>                start interval=0s timeout=40s (httpd_srv-start-interval-0s)
>>>>                stop interval=0s timeout=60s (httpd_srv-stop-interval-0s)
>>>>  Clone: ping_fw-clone
>>>>   Resource: ping_fw (class=ocf provider=pacemaker type=ping)
>>>>    Attributes: dampen=10s host_list=192.168.18.1 multiplier=1000
>>>>    Operations: monitor interval=10s timeout=60s (ping_fw-monitor-interval-10s)
>>>>                start interval=0s timeout=60s (ping_fw-start-interval-0s)
>>>>                stop interval=0s timeout=20s (ping_fw-stop-interval-0s)
>>>>
>>>> Stonith Devices:
>>>> Fencing Levels:
>>>>
>>>> Location Constraints:
>>>>   Resource: mysql-server
>>>>     Constraint: location-mysql-server
>>>>       Rule: boolean-op=or score=-INFINITY (id:location-mysql-server-rule)
>>>>         Expression: pingd lt 1 (id:location-mysql-server-rule-expr)
>>>>         Expression: not_defined pingd (id:location-mysql-server-rule-expr-1)
>>>> Ordering Constraints:
>>>>   start mysql-server then start httpd_srv (kind:Mandatory) (id:order-mysql-server-httpd_srv-mandatory)
>>>>   start database_drbd-clone then start drbd_logsfiles-clone (kind:Mandatory) (id:order-database_drbd-clone-drbd_logsfiles-clone-mandatory)
>>>>   start drbd_logsfiles-clone then start fs_database (kind:Mandatory) (id:order-drbd_logsfiles-clone-fs_database-mandatory)
>>>
>>> You need to order fs_database after the promote operation; and as I just
>>> found, pacemaker also does not reverse it correctly and executes the fs
>>> stop and the drbd demote concurrently. So you need an additional order
>>> constraint to first stop the fs, then demote drbd.
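>>> A sketch of what that could look like with pcs (the promote/start pair
>>> would replace the plain start ordering on fs_database above; resource
>>> names as in your config):
>>>
>>>   pcs constraint order promote database_drbd-clone then start fs_database
>>>   pcs constraint order promote drbd_logsfiles-clone then start fs_logfiles
>>>   pcs constraint order stop fs_database then demote database_drbd-clone
>>>   pcs constraint order stop fs_logfiles then demote drbd_logsfiles-clone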
>>>
>>>>   start fs_database then start fs_logfiles (kind:Mandatory) (id:order-fs_database-fs_logfiles-mandatory)
>>>>   start fs_logfiles then start mysql-server (kind:Mandatory) (id:order-fs_logfiles-mysql-server-mandatory)
>>>> Colocation Constraints:
>>>>   fs_logfiles with drbd_logsfiles-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_logfiles-drbd_logsfiles-clone-INFINITY)
>>>>   fs_database with database_drbd-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_database-database_drbd-clone-INFINITY)
>>>>   drbd_logsfiles-clone with database_drbd-clone (score:INFINITY) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-drbd_logsfiles-clone-database_drbd-clone-INFINITY)
>>>>   HA_IP with database_drbd-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-HA_IP-database_drbd-clone-INFINITY)
>>>>   mysql-server with fs_database (score:INFINITY) (id:colocation-mysql-server-fs_database-INFINITY)
>>>>   httpd_srv with mysql-server (score:INFINITY) (id:colocation-httpd_srv-mysql-server-INFINITY)
>>>> Ticket Constraints:
>>>>
>>>> Alerts:
>>>>  No alerts defined
>>>>
>>>> Resources Defaults:
>>>>  No defaults set
>>>> Operations Defaults:
>>>>  No defaults set
>>>>
>>>> Cluster Properties:
>>>>  cluster-infrastructure: corosync
>>>>  cluster-name: mysql_cluster
>>>>  dc-version: 2.0.4-6.el8_3.1-2deceaa3ae
>>>>  have-watchdog: false
>>>>  last-lrm-refresh: 1620742514
>>>>  stonith-enabled: FALSE
>>>>
>>>> Tags:
>>>>  No tags defined
>>>>
>>>> Quorum:
>>>>   Options:
>>>>
>>>>
>>>> Any suggestions are welcome
>>>>
>>>> best regards, stay safe, take care
>>>>
>>>> fatcharly
>>>>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/