Hi,
Can you please explain for what purpose the Pacemaker cluster is used here?
Regards,
Jiffin
On Thursday 07 December 2017 06:59 PM, Tomalak Geret'kal wrote:
Hi guys
I'm wondering if anyone here is using the GlusterFS OCF resource
agents with Pacemaker on CentOS 7?
yum install centos-release-gluster
yum install glusterfs-server glusterfs-resource-agents
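As a quick sanity check on my side (nothing the packages require), I confirm what actually got installed:

gluster --version
rpm -q glusterfs-server glusterfs-resource-agents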
The reason I ask is that there seem to be a few problems with them on
3.10, but these problems are so severe that I'm struggling to believe
I'm not just doing something wrong.
I created my brick (on a volume previously used for DRBD, thus its name):
mkfs.xfs /dev/cl/lv_drbd -f
mkdir -p /gluster
mount -t xfs /dev/cl/lv_drbd /gluster
mkdir -p /gluster/test_brick
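Before creating the volume I sanity-check that the brick directory really sits on the new XFS mount (my own habit, not a Gluster requirement):

df -h /gluster
mount | grep /gluster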
And then my volume (enabling clients to mount it via NFS):
systemctl start glusterd
gluster volume create test_logs replica 2 transport tcp \
    pcmk01-drbd:/gluster/test_brick pcmk02-drbd:/gluster/test_brick
gluster volume start test_logs
gluster volume set test_logs nfs.disable off
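To confirm the volume came up as expected, I check it with the standard CLI (just a sanity check):

gluster volume info test_logs
gluster volume status test_logs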
And here's where the fun starts.
Firstly, we need to work around bug 1233344* (which was closed when
3.7 went end-of-life but still seems valid in 3.10):
sed -i \
    's#voldir="/etc/glusterd/vols/${OCF_RESKEY_volname}"#voldir="/var/lib/glusterd/vols/${OCF_RESKEY_volname}"#' \
    /usr/lib/ocf/resource.d/glusterfs/volume
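A quick grep confirms the substitution landed in the agent:

grep voldir /usr/lib/ocf/resource.d/glusterfs/volume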
With that done, I [attempt to] stop GlusterFS so it can be brought
under Pacemaker control:
systemctl stop glusterfsd
systemctl stop glusterd
umount /gluster
(I usually have to manually kill glusterfs processes at this point
before the unmount works - why does the systemctl stop not do it?)
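For the record, the manual cleanup I end up doing looks roughly like this; it's a blunt instrument, and I'd love to know the proper sequence:

pgrep -af gluster   # see which gluster processes survived the systemctl stops
pkill glusterfs     # kill the survivors (pattern also matches glusterfsd)
umount /gluster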
With the node in standby (just one is online in this example, but
another is configured), I then set up the resources:
pcs node standby
pcs resource create gluster_data ocf:heartbeat:Filesystem \
    device="/dev/cl/lv_drbd" directory="/gluster" fstype="xfs"
pcs resource create glusterd ocf:glusterfs:glusterd
pcs resource create gluster_vol ocf:glusterfs:volume volname="test_logs"
pcs resource create test_logs ocf:heartbeat:Filesystem \
    device="localhost:/test_logs" directory="/var/log/test" fstype="nfs" \
    options="vers=3,tcp,nolock,context=system_u:object_r:httpd_sys_content_t:s0" \
    op monitor OCF_CHECK_LEVEL="20"
pcs resource clone glusterd
pcs resource clone gluster_data
pcs resource clone gluster_vol ordered=true
pcs constraint order start gluster_data-clone then start glusterd-clone
pcs constraint order start glusterd-clone then start gluster_vol-clone
pcs constraint order start gluster_vol-clone then start test_logs
pcs constraint colocation add test_logs with FloatingIp INFINITY
(note the SELinux wrangling - this is because I have a CGI web
application which will later need to read files from the /var/log/test
mount)
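Before taking the node out of standby I double-check what pcs thinks I've configured (these subcommands are from the pcs shipped with CentOS 7, as far as I know):

pcs resource show
pcs constraint show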
At this point, even with the node in standby, it's /already/ failing:
[root@pcmk01 ~]# pcs status
Cluster name: test_cluster
Stack: corosync
Current DC: pcmk01-cr (version 1.1.15-11.el7_3.5-e174ec8) - partition WITHOUT quorum
Last updated: Thu Dec 7 13:20:41 2017
Last change: Thu Dec 7 13:09:33 2017 by root via crm_attribute on pcmk01-cr
2 nodes and 13 resources configured
Online: [ pcmk01-cr ]
OFFLINE: [ pcmk02-cr ]
Full list of resources:
FloatingIp (ocf::heartbeat:IPaddr2): Started pcmk01-cr
test_logs (ocf::heartbeat:Filesystem): Stopped
Clone Set: glusterd-clone [glusterd]
Stopped: [ pcmk01-cr pcmk02-cr ]
Clone Set: gluster_data-clone [gluster_data]
Stopped: [ pcmk01-cr pcmk02-cr ]
Clone Set: gluster_vol-clone [gluster_vol]
gluster_vol (ocf::glusterfs:volume): FAILED pcmk01-cr (blocked)
Stopped: [ pcmk02-cr ]
Failed Actions:
* gluster_data_start_0 on pcmk01-cr 'not configured' (6): call=72, status=complete,
  exitreason='DANGER! xfs on /dev/cl/lv_drbd is NOT cluster-aware!',
  last-rc-change='Thu Dec 7 13:09:28 2017', queued=0ms, exec=250ms
* gluster_vol_stop_0 on pcmk01-cr 'unknown error' (1): call=60, status=Timed Out,
  exitreason='none',
  last-rc-change='Thu Dec 7 12:55:11 2017', queued=0ms, exec=20004ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
1. The data mount can't be created? Why?
2. Why is there a volume "stop" command being attempted, and why does
it fail?
3. Why is any of this happening in standby? I can't have the resources
failing before I've even made the node live! I could understand why a
gluster_vol start operation would fail when glusterd is (correctly)
stopped, but why is there a *stop* operation? And why does that make
the resource "blocked"?
Given the above steps, is there something fundamental I'm missing
about how these resource agents should be used? How do *you* configure
GlusterFS on Pacemaker?
Any advice appreciated.
Best regards
* https://bugzilla.redhat.com/show_bug.cgi?id=1233344
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users