On 02/02/17 01:43 PM, Lentes, Bernd wrote:
> Hi,
>
> i'm implementing a two node cluster with SLES 11 SP4. I have a shared storage
> (FC-SAN).
> I'm planning to use clvm.
> What i did already:
> Connecting the SAN to the hosts
> Creating a volume on the SAN
> Volume is visible on both nodes (through multipath and device-mapper)
> My pacemaker config looks like this:
>
> crm(live)# configure show
> node ha-idg-1
> node ha-idg-2
> primitive prim_clvmd ocf:lvm2:clvmd \
>         op stop interval=0 timeout=100 \
>         op start interval=0 timeout=90 \
>         op monitor interval=20 timeout=20
> primitive prim_dlm ocf:pacemaker:controld \
>         op start interval=0 timeout=90 \
>         op stop interval=0 timeout=100 \
>         op monitor interval=60 timeout=60
> primitive prim_stonith_ilo_ha-idg-1 stonith:external/riloe \
>         params ilo_hostname=SUNHB65279 hostlist=ha-idg-1 ilo_user=root
> ilo_password=**** \
>         op monitor interval=60m timeout=120s \
>         meta target-role=Started
> primitive prim_stonith_ilo_ha-idg-2 stonith:external/riloe \
>         params ilo_hostname=SUNHB58820-3 hostlist=ha-idg-2 ilo_user=root
> ilo_password=**** \
>         op monitor interval=60m timeout=120s \
>         meta target-role=Started
> primitive prim_vg_cluster_01 LVM \
>         params volgrpname=vg_cluster_01 \
>         op monitor interval=60 timeout=60 \
>         op start interval=0 timeout=30 \
>         op stop interval=0 timeout=30
> group group_prim_dlm_clvmd_vg_cluster_01 prim_dlm prim_clvmd
> prim_vg_cluster_01
> clone clone_group_prim_dlm_clvmd_vg_cluster_01
> group_prim_dlm_clvmd_vg_cluster_01 \
>         meta target-role=Started
> location loc_prim_stonith_ilo_ha-idg-1 prim_stonith_ilo_ha-idg-1 -inf:
> ha-idg-1
> location loc_prim_stonith_ilo_ha-idg-2 prim_stonith_ilo_ha-idg-2 -inf:
> ha-idg-2
> property cib-bootstrap-options: \
>         dc-version=1.1.12-f47ea56 \
>         cluster-infrastructure="classic openais (with plugin)" \
>         expected-quorum-votes=2 \
>         no-quorum-policy=ignore \
>         last-lrm-refresh=1485872095 \
>         stonith-enabled=true \
>         default-resource-stickiness=100 \
>         start-failure-is-fatal=true \
>         is-managed-default=true \
>         stop-orphan-resources=true
> rsc_defaults rsc-options: \
>         target-role=stopped \
>         resource-stickiness=100 \
>         failure-timeout=0
> op_defaults op-options: \
>         on-fail=restart
>
>
> This is the status:
> crm(live)# status
> Last updated: Thu Feb 2 19:14:10 2017
> Last change: Thu Feb 2 19:05:26 2017 by root via cibadmin on ha-idg-2
> Stack: classic openais (with plugin)
> Current DC: ha-idg-2 - partition with quorum
> Version: 1.1.12-f47ea56
> 2 Nodes configured, 2 expected votes
> 8 Resources configured
>
>
> Online: [ ha-idg-1 ha-idg-2 ]
>
> Clone Set: clone_group_prim_dlm_clvmd_vg_cluster_01
> [group_prim_dlm_clvmd_vg_cluster_01]
>      Started: [ ha-idg-1 ha-idg-2 ]
>
> Failed actions:
>     prim_stonith_ilo_ha-idg-1_start_0 on ha-idg-2 'unknown error' (1):
> call=100, status=Timed Out, exit-reason='none', last-rc-change='Tue Jan 31
> 15:14:34 2017', queued=0ms, exec=20004ms
>     prim_stonith_ilo_ha-idg-2_start_0 on ha-idg-1 'unknown error' (1):
> call=107, status=Error, exit-reason='none', last-rc-change='Tue Jan 31
> 15:14:55 2017', queued=0ms, exec=11584ms
>
> Until now everything is fine. The stonith resources have currently wrong
> passwords for the ILO adapters. It's difficult enough to establish a
> HA-cluster for the first time.
> Until now i don't like to have my hosts booting all the time because of my
> errors in the configuration.

If stonith is called, DLM blocks and stays blocked until it is told that
stonith was successful (by design). So it is possible that a failed stonith
has left DLM blocked, which would block clvmd as it uses DLM.

> I created a vg and a lv, it's visible on both nodes.
> My plan is to use for each vm a dedicated lv. VM's should run on both nodes,
> some on nodeA, some on nodeB.
> If the cluster cares about the mounting of the fs inside the lv (i'm planning
> to use btrfs), i should not need a cluster fs ? Right ?

It would be wiser to use the LV as the raw device for the VM, if you are
creating an LV per VM anyway. btrfs (like most FSes) is not cluster-aware and
can only be mounted on one node at a time, which prevents live migration.

> Because the cluster cares that the fs is always mounted only on one node.
> That's what i've been told.
> I'd like to use btrfs because of its snapshot capability which is great.
> Should i create now a resource group with the lv, the fs and the vm ?
> I stumbled across sfex. It seems to provide an additional layer of security
> concerning access to a shared storage (my lv ?).
> Is it senseful, does anyone have experience with it ?
>
> Btw: Suse recommends
> (https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_clvm_config.html)
> to create a mirrored lv.
> Is that really necessary/advisable ? My lv's reside on a SAN which is a RAID5
> configuration. I don't see the benefit and the need of a mirrored lv,
> just the disadvantage of wasting disk space. Beside the RAID we have a
> backup, and before changes of the vm's i will create a btrfs snapshot.
> Unfortunately i'm not able to create a snapshot inside the vm because they
> are running older versions of Suse which don't support btrfs. Of course i
> could
> recreate the vm's with a lvm configuration inside themselves. Maybe, if i
> have time enough. Then i could create snapshots with lvm tools.
>
> Thanks.

Snapshotting running VMs is not advised, in my opinion. There is no way to be
sure that disk writes are flushed, or that apps like DBs are consistent. You
might well find that your snapshot doesn't work when you need it most. It is
much safer to use a backup program/agents that know how to put the data being
backed up into a clean state.

-- 
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s
brain than in the near certainty that people of equal talent have lived and
died in cotton fields and sweatshops." - Stephen Jay Gould

_______________________________________________
Users mailing list: [email protected]
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
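
As a rough illustration of the raw-LV-per-VM layout suggested in the reply above, the fragment below sketches how one VM might be added to the posted configuration. It is untested and rests on several assumptions: the resource name prim_vm_01, the constraint names, the libvirt XML path, and all timeouts are hypothetical, and the domain XML is assumed to point its disk at an LV in vg_cluster_01 (for example a hypothetical /dev/vg_cluster_01/lv_vm_01). It uses the ocf:heartbeat:VirtualDomain agent and orders the VM after, and colocates it with, the existing DLM/clvmd/VG clone.

```
# Sketch only: names, paths and timeouts are assumptions, not from the
# original post. The domain XML is assumed to use the raw LV as its disk.
primitive prim_vm_01 ocf:heartbeat:VirtualDomain \
        params config="/etc/libvirt/qemu/vm_01.xml" \
               hypervisor="qemu:///system" migration_transport=ssh \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=120 \
        op monitor interval=30 timeout=30 \
        op migrate_to interval=0 timeout=120 \
        op migrate_from interval=0 timeout=120 \
        meta allow-migrate=true target-role=Started
# Start the VM only after DLM, clvmd and the VG are up on that node.
order ord_clvm_before_vm_01 inf: clone_group_prim_dlm_clvmd_vg_cluster_01 prim_vm_01
# Run the VM only on a node where the clone instance is running.
colocation col_vm_01_with_clvm inf: prim_vm_01 clone_group_prim_dlm_clvmd_vg_cluster_01
```

Because the host mounts no filesystem on the LV, only the guest touches the device, and since the cloned VG resource keeps the volume group active on both nodes, live migration stays possible. The explicit target-role=Started is there because the posted rsc_defaults set target-role=stopped.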
