[ClusterLabs] multiple drives looks like balancing but why and causing troubles
I have a two-node cluster. Both nodes are virtual and have five shared drives attached via a SAS controller. For some reason, the cluster shows half of the drives started on each node. I am not sure whether this is called split-brain or not; it definitely looks like load balancing, but I did not set up load balancing. On my client, I only see the data for the shares running on the active cluster node, but they should all be on the active cluster node. Any suggestions as to why this is happening? Is there a setting so that everything runs on only one node at a time?

pcs cluster status:

Cluster name: CNAS
Last updated: Wed Aug 26 13:35:47 2015
Last change: Wed Aug 26 13:28:55 2015
Stack: classic openais (with plugin)
Current DC: nas02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
11 Resources configured

Online: [ nas01 nas02 ]

Full list of resources:

 NAS            (ocf::heartbeat:IPaddr2):       Started nas01
 Resource Group: datag
     datashare  (ocf::heartbeat:Filesystem):    Started nas02
     dataserver (ocf::heartbeat:nfsserver):     Started nas02
 Resource Group: oomtlg
     oomtlshare (ocf::heartbeat:Filesystem):    Started nas01
     oomtlserver (ocf::heartbeat:nfsserver):    Started nas01
 Resource Group: oomtrg
     oomtrshare (ocf::heartbeat:Filesystem):    Started nas02
     oomtrserver (ocf::heartbeat:nfsserver):    Started nas02
 Resource Group: oomblg
     oomblshare (ocf::heartbeat:Filesystem):    Started nas01
     oomblserver (ocf::heartbeat:nfsserver):    Started nas01
 Resource Group: oombrg
     oombrshare (ocf::heartbeat:Filesystem):    Started nas02
     oombrserver (ocf::heartbeat:nfsserver):    Started nas02

pcs config show:

Cluster Name: CNAS

Corosync Nodes:
 nas01 nas02
Pacemaker Nodes:
 nas01 nas02

Resources:
 Resource: NAS (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=192.168.56.110 cidr_netmask=24
  Operations: start interval=0s timeout=20s (NAS-start-timeout-20s)
              stop interval=0s timeout=20s (NAS-stop-timeout-20s)
              monitor interval=10s timeout=20s (NAS-monitor-interval-10s)
 Group: datag
  Resource: datashare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sdb1 directory=/data fstype=ext4
   Operations: start interval=0s timeout=60 (datashare-start-timeout-60)
               stop interval=0s timeout=60 (datashare-stop-timeout-60)
               monitor interval=20 timeout=40 (datashare-monitor-interval-20)
  Resource: dataserver (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/data/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=40 (dataserver-start-timeout-40)
               stop interval=0s timeout=20s (dataserver-stop-timeout-20s)
               monitor interval=10 timeout=20s (dataserver-monitor-interval-10)
 Group: oomtlg
  Resource: oomtlshare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sdc1 directory=/oomtl fstype=ext4
   Operations: start interval=0s timeout=60 (oomtlshare-start-timeout-60)
               stop interval=0s timeout=60 (oomtlshare-stop-timeout-60)
               monitor interval=20 timeout=40 (oomtlshare-monitor-interval-20)
  Resource: oomtlserver (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/oomtl/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=40 (oomtlserver-start-timeout-40)
               stop interval=0s timeout=20s (oomtlserver-stop-timeout-20s)
               monitor interval=10 timeout=20s (oomtlserver-monitor-interval-10)
 Group: oomtrg
  Resource: oomtrshare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sdd1 directory=/oomtr fstype=ext4
   Operations: start interval=0s timeout=60 (oomtrshare-start-timeout-60)
               stop interval=0s timeout=60 (oomtrshare-stop-timeout-60)
               monitor interval=20 timeout=40 (oomtrshare-monitor-interval-20)
  Resource: oomtrserver (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/oomtr/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=40 (oomtrserver-start-timeout-40)
               stop interval=0s timeout=20s (oomtrserver-stop-timeout-20s)
               monitor interval=10 timeout=20s (oomtrserver-monitor-interval-10)
 Group: oomblg
  Resource: oomblshare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sde1 directory=/oombl fstype=ext4
   Operations: start interval=0s timeout=60 (oomblshare-start-timeout-60)
               stop interval=0s timeout=60 (oomblshare-stop-timeout-60)
               monitor interval=20 timeout=40 (oomblshare-monitor-interval-20)
  Resource: oomblserver (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/oombl/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=40 (oomblserver-start-timeout-40)
               stop interval=0s timeout=20s
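[Editor's note: what the status output shows is Pacemaker's default placement, which spreads resource groups across the available nodes; the nodes agree on where everything is, so this is not split-brain. If every group should follow the NAS address, colocation constraints make that explicit. A minimal sketch using the group and resource names from the config above; the pcs colocation syntax shown is standard, but verify against your pcs version:]

```shell
# Colocate every resource group with the floating NAS address so all
# resources run on whichever node currently holds the IP.
# INFINITY makes the constraint mandatory rather than a preference.
for grp in datag oomtlg oomtrg oomblg oombrg; do
    pcs constraint colocation add "$grp" with NAS INFINITY
done
```

With these in place, moving NAS (or losing the node that holds it) drags all five groups along to the same node.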
Re: [ClusterLabs] upgrade from 1.1.9 to 1.1.12 fails to start
I created a whole new virtual machine and installed everything with the new version, and Pacemaker wouldn't start. I have not yet learned how to use the logs to see what they have to say. No, I did not upgrade Corosync; I am running the latest version that will work with RHEL 6. When I tried later versions, they failed, and I was told it was because we are not running RHEL 7. I am getting the feeling this version of Pacemaker does not work on RHEL 6 either. Do you believe this is the case? Or is there some configuration that needs to be done between 1.1.9 and 1.1.12?

Michelle Streeter
ASC2 MCS - SDE/ACL/SDL/EDL OKC
Software Engineer
The Boeing Company

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
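[Editor's note: since the poster mentions not yet knowing how to read the logs, a quick sketch for finding Pacemaker's startup errors on RHEL 6. Log locations vary by stack; these are the usual ones for the CMAN/plugin setup, so adjust paths if your syslog configuration differs:]

```shell
# Pacemaker daemons log through syslog on RHEL 6; the corosync/CMAN
# stack also writes /var/log/cluster/corosync.log on many setups.
# Pull the most recent error-ish lines from the daemons involved:
grep -iE 'pacemakerd|crmd|pengine|cib|stonith|error|segfault' \
    /var/log/messages | tail -n 50
```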
[ClusterLabs] upgrade from 1.1.9 to 1.1.12 fails to start
I was recommended to upgrade from 1.1.9 to 1.1.12. I had to uninstall the 1.1.9 version to install the 1.1.12 version. I am not allowed to connect to a repo, so I have to download the RPMs and install them individually. After I installed pacemaker-libs, pacemaker-cli, pacemaker-cluster-libs, and pacemaker itself, the cluster failed to start when I rebooted. When I tried to manually start it, I got:

Starting Pacemaker Cluster Manager
/etc/init.d/pacemaker: line 94:  8219 Segmentation fault      (core dumped) $prog > /dev/null 2>&1

I deleted the cluster.conf file and the cib.xml and all the backup versions and tried again, and got the same error. I googled this error and really got nothing. Any ideas?

Michelle Streeter
ASC2 MCS - SDE/ACL/SDL/EDL OKC
Software Engineer
The Boeing Company
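[Editor's note: when RPMs are installed by hand, a startup segfault is often a version mismatch, e.g. a leftover 1.1.9 library next to 1.1.12 binaries. A quick way to check, assuming the usual RHEL 6 package names (adjust the patterns if your build differs):]

```shell
# List every cluster-related package with its exact version so any
# mismatched leftovers from the old install stand out in the output.
rpm -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}\n' \
    'pacemaker*' 'corosync*' 'libqb*' 'resource-agents*' 'cman*' | sort
```

If any pacemaker-* package reports a different version than the rest, remove and reinstall it so the whole set matches.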
[ClusterLabs] nfsServer Filesystem Failover average 76s
I am getting an average failover time for NFS of 76 seconds. I have set all the start and stop settings to 10s, but no change. The web page fails over instantly, but not NFS. I am running a two-node cluster on RHEL 6 with Pacemaker 1.1.9. Surely these times are not right? Any suggestions?

Resources:
 Group: nfsgroup
  Resource: nfsshare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/sdb1 directory=/data fstype=ext4
   Operations: start interval=0s (nfsshare-start-interval-0s)
               stop interval=0s (nfsshare-stop-interval-0s)
               monitor interval=10s (nfsshare-monitor-interval-10s)
  Resource: nfsServer (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/data/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=10s (nfsServer-start-timeout-10s)
               stop interval=0s timeout=10s (nfsServer-stop-timeout-10s)
               monitor interval=10 timeout=20s (nfsServer-monitor-interval-10)
  Resource: NAS (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=192.168.56.110 cidr_netmask=24
   Operations: start interval=0s timeout=20s (NAS-start-timeout-20s)
               stop interval=0s timeout=20s (NAS-stop-timeout-20s)
               monitor interval=10s timeout=20s (NAS-monitor-interval-10s)

Michelle Streeter
ASC2 MCS - SDE/ACL/SDL/EDL OKC
Software Engineer
The Boeing Company
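[Editor's note: a delay in this range is usually not the Pacemaker timeouts at all but the NFSv4 grace period, during which a freshly started server makes clients wait for lock reclaim (the kernel defaults are on the order of 90s grace / 90s lease). A hedged sketch for inspecting and shortening it via the nfsd proc interface on a RHEL 6 kernel; these files are only writable while nfsd is stopped, and 10s is an illustrative value:]

```shell
# Current NFSv4 grace and lease times, in seconds:
cat /proc/fs/nfsd/nfsv4gracetime
cat /proc/fs/nfsd/nfsv4leasetime

# Shorten both (must be done while nfsd is not running, e.g. before
# the cluster starts the nfsserver resource):
echo 10 > /proc/fs/nfsd/nfsv4gracetime
echo 10 > /proc/fs/nfsd/nfsv4leasetime
```

A shorter grace period trades lock-reclaim safety for faster failover, so test with your real client workload before settling on a value.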
Re: [ClusterLabs] NFS - Non Mirroring
Oops, and a clarification: I was wrong about the iSCSI. It is Serial Attached SCSI (SAS).

From: Streeter, Michelle N
Sent: Tuesday, August 04, 2015 10:39 AM
To: 'users@clusterlabs.org'
Subject: NFS - Non Mirroring

Our current configuration uses RHEL 6 HA, which uses cluster.conf and nfsclient. I am testing the newer upgrade to Pacemaker; the current test is Pacemaker 1.1.9, and I am still constrained to RHEL 6.6. We have a two-node system. Both nodes point to the same drives via SAS, so we are not mirroring. I see all kinds of info on nfsserver and DRBD, but that is for mirroring. I saw some material using exportfs, but the nfs-kernel-? seemed to be an older implementation. So, with that said, which implementation should I use for my NFS system?

BTW, I am currently testing using VirtualBox: two nodes that mount the NFS from the SAN-ish virtual machine (these are the cluster nodes), one SAN-ish VM which runs the NFS server, and one RHEL 6 machine with a GUI to test the cluster.

Michelle Streeter
ASC2 MCS - SDE/ACL/SDL/EDL OKC
Software Engineer
The Boeing Company
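[Editor's note: DRBD is only needed when each node has its own copy of the data. With shared SAS storage, the common non-mirrored pattern is a single group of Filesystem + nfsserver + exportfs + IPaddr2. A minimal sketch; all resource names, the device, paths, and the client subnet below are illustrative, not taken from the poster's setup:]

```shell
# Shared-disk NFS failover group (no DRBD). Only one node may mount
# the filesystem at a time; the group guarantees start order.
pcs resource create sharefs ocf:heartbeat:Filesystem \
    device=/dev/sdb1 directory=/data fstype=ext4 --group nfsgrp
pcs resource create nfsd ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/data/nfsinfo --group nfsgrp
pcs resource create dataexport ocf:heartbeat:exportfs \
    clientspec=192.168.56.0/24 options=rw,sync directory=/data fsid=1 \
    --group nfsgrp
pcs resource create nfsip ocf:heartbeat:IPaddr2 \
    ip=192.168.56.110 cidr_netmask=24 --group nfsgrp
```

The exportfs agent replaces a static /etc/exports entry, so the export moves with the group; fsid must be stable across both nodes or clients will get stale handles after failover.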
[ClusterLabs] Cluster.conf to cib.xml
The previous administrator set up our cluster using the CMAN version and the cluster.conf file. We want to upgrade to Pacemaker 1.1.9 on our RHEL 6.6, so I have been testing this upgrade. But I noticed that the cib.xml does not reflect what we have in the cluster.conf file. Is there a tool or something to port the configuration to the CIB?

Michelle Streeter
ASC2 MCS - SDE/ACL/SDL/EDL OKC
Software Engineer
The Boeing Company
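[Editor's note: there is a dedicated converter for this, clufter, whose ccs2pcs command translates a CMAN/rgmanager cluster.conf into a Pacemaker configuration. Availability on RHEL 6.6 and the exact option spelling are assumptions here; check your version's help output:]

```shell
# Convert the old CMAN configuration to a Pacemaker CIB.
# Option names may differ between clufter releases; confirm with:
clufter ccs2pcs --help

# Typical invocation (paths are the stock locations):
clufter ccs2pcs --input /etc/cluster/cluster.conf --output cib.xml
```

Treat the output as a starting point rather than a drop-in CIB: review the generated resources and constraints before loading them into the cluster.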