Hi,
my cluster config is now:
node host1
node host2
node host3
primitive FileServer ocf:heartbeat:VirtualDomain \
params config="/etc/libvirt/qemu/FileServerserver.xml" hypervisor="qemu:///system" migration_transport="ssh" \
op start interval="0" timeout="90s" \
op stop interval="0" timeout="90s" \
op monitor interval="30" timeout="60s" \
op migrate_from interval="0" timeout="120" \
op migrate_to interval="0" timeout="120" \
meta migration-threshold="3" target-role="Stopped" allow-migrate="true" is-managed="true"
primitive Iscsi lsb:open-iscsi \
operations $id="Iscsi-operation" \
op start interval="0" timeout="15s" \
op stop interval="0" timeout="15s" \
op monitor interval="30s" timeout="15s"
primitive PingSan ocf:pacemaker:ping \
params name="pingd-san" host_list="192.168.1.3" multiplier="100" \
op monitor interval="10s" timeout="60s" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="60s"
primitive Virsh lsb:libvirt-bin \
operations $id="Virsh-operation" \
op start interval="0" timeout="15s" \
op stop interval="0" timeout="15s" \
op monitor interval="30s" timeout="15s"
group Service Iscsi Virsh
clone PingSanClone PingSan \
meta globally-unique="false" interleave="true" target-role="Started"
clone ServiceClone Service \
meta globally-unique="false" interleave="true" target-role="Started"
location ServiceCloneLocation ServiceClone \
rule $id="ServiceCloneOnConnectedSan" -inf: not_defined pingd-san or pingd-san lte 0
colocation B inf: FileServer ServiceClone
order A inf: ServiceClone:start FileServer
property $id="cib-bootstrap-options" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
dc-version="1.0.11-6e010d6b0d49a6b929d17c0114e9d2d934dc8e04" \
cluster-infrastructure="openais" \
expected-quorum-votes="3" \
start-failure-is-fatal="false" \
stop-orphan-resources="false" \
stop-orphan-actions="false"
rsc_defaults $id="rsc-options" \
resource-stickiness="200"
After I start the cluster, I get this:
Online: [ host1 host2 host3 ]
Clone Set: ServiceClone
Started: [ host1 host2 host3 ]
Clone Set: PingSanClone
Started: [ host1 host2 host3 ]
FileServer (ocf::heartbeat:VirtualDomain) Started (unmanaged) FAILED[ host2 host3 host1 ]
Migration summary:
* Node host2:
FileServer: migration-threshold=3 fail-count=1000000 last-failure='Thu Aug 25 20:24:28 2011'
* Node host3:
FileServer: migration-threshold=3 fail-count=1000000 last-failure='Thu Aug 25 20:24:28 2011'
* Node host1:
FileServer: migration-threshold=3 fail-count=1000000 last-failure='Thu Aug 25 20:24:28 2011'
Failed actions:
FileServer_monitor_0 (node=host2, call=25, rc=1, status=complete): unknown error
FileServer_stop_0 (node=host2, call=26, rc=1, status=complete): unknown error
FileServer_monitor_0 (node=host3, call=25, rc=1, status=complete): unknown error
FileServer_stop_0 (node=host3, call=26, rc=1, status=complete): unknown error
FileServer_monitor_0 (node=host1, call=25, rc=1, status=complete): unknown error
FileServer_stop_0 (node=host1, call=26, rc=1, status=complete): unknown error
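
A note for reading the rc values above: they are the standard OCF exit codes. A probe (the monitor_0 operation) on a node where the VM is simply not running would normally return 7 (OCF_NOT_RUNNING), while rc=1 is the generic error that crm_mon prints as "unknown error". That is probably why the cluster treats the resource as failed on every node and then tries to stop it. A small shell helper to translate the codes, only as a sketch; the function name ocf_rc_name is made up for illustration:

```shell
# Translate a numeric OCF resource-agent exit code into its symbolic name.
# ocf_rc_name is a hypothetical helper, not part of Pacemaker.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        *) echo "UNKNOWN_RC_$1" ;;
    esac
}

ocf_rc_name 1   # the rc reported for the monitor_0 and stop_0 operations
ocf_rc_name 7   # the rc a clean probe of a stopped resource should give
```
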
In the log, after a cleanup, I found this:
Aug 25 20:24:28 host1 lrmd: [12314]: debug: lrmd_rsc_destroy: removing resource
File
Aug 25 20:24:28 host1 lrmd: [12314]: debug: on_msg_add_rsc:client [12317] adds
resource File
Aug 25 20:24:28 host1 lrmd: [12314]: debug: on_msg_perform_op:2385: copying
parameters for rsc File
Aug 25 20:24:28 host1 lrmd: [12314]: debug: on_msg_perform_op: add an
operation operation monitor[25] on ocf::VirtualDomain::FileServer for client
12317, its parameters: crm_feature_set=[3.0.1]
config=[/etc/libvirt/qemu/FileServerserver.xml] migration_transport=[ssh]
hypervisor=[qemu:///system] CRM_meta_timeout=[60000] to the operation list.
Aug 25 20:24:28 host1 lrmd: [12314]: info: rsc:FileServer:25: probe
Aug 25 20:24:28 host1 lrmd: [28405]: debug: perform_ra_op: resetting scheduler
class to SCHED_OTHER
Aug 25 20:24:28 host1 lrmd: [12314]: WARN: Managed FileServer:monitor process
28405 exited with return code 1.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I have debugged the VirtualDomain agent by hand and it seems OK. Can someone
help me with this problem?
I think the problem is in the interaction between the cluster and libvirt.
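
For reference, this is roughly how the agent can be run by hand outside the cluster, with the same parameters that lrmd logs for the probe. It is only a sketch under assumptions: /usr/lib/ocf is assumed as the OCF root, and the function name run_ra_action and the RA_PATH override are made up for illustration.

```shell
# Run one action of an OCF resource agent by hand, with the parameters the
# cluster passes for FileServer, and report the exit code.
# Assumptions: OCF_ROOT defaults to /usr/lib/ocf; run_ra_action and the
# RA_PATH override are hypothetical names used only in this sketch.
run_ra_action() {
    action="$1"
    ocf_root="${OCF_ROOT:-/usr/lib/ocf}"
    agent="${RA_PATH:-$ocf_root/resource.d/heartbeat/VirtualDomain}"
    rc=0
    # OCF agents read their parameters from OCF_RESKEY_<name> variables.
    OCF_ROOT="$ocf_root" \
    OCF_RESKEY_config="/etc/libvirt/qemu/FileServerserver.xml" \
    OCF_RESKEY_hypervisor="qemu:///system" \
    OCF_RESKEY_migration_transport="ssh" \
    "$agent" "$action" || rc=$?
    echo "$action returned rc=$rc"
    return "$rc"
}
```

Calling `run_ra_action monitor` as root on each node, while the VM is shut off, should show whether the agent itself returns 7 (not running) or the rc=1 that the cluster sees.
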
thanks to all
Umberto
On Thursday 11 August 2011 08:04:36, Andrew Beekhof wrote:
> On Wed, Aug 10, 2011 at 11:15 PM, Maloja01 <[email protected]> wrote:
> > The order constraints do work as I assume, but I guess that
> > you have run into a pitfall:
> >
> > A clone is marked as "up" if one instance in the cluster has started
> > successfully. The order constraint does not say that the clone
> > instance on the same node must be up.
>
> Use a colocation constraint to get that.
>
> > Kind regards
> > Fabian
> >
> > On 08/10/2011 01:43 PM, [email protected] wrote:
> >> Hi,
> >> excuse my poor English; I use Google to help me with translation,
> >> and I am a newbie in clustering :-).
> >>
> >> I'm trying to set up a three-node cluster for virtualization. I
> >> used a how-to that I found at
> >> http://www.linbit.com/support/ha-kvm.pdf to configure the cluster.
> >> The VM volumes are shared from an Openfiler cluster over iSCSI,
> >> which works well.
> >>
> >> The VM starts fine on the hosts when I run it outside the cluster.
> >>
> >> The problem is that the VM starts before libvirt and the open-iscsi
> >> initiator. I have set an order rule, but it does not seem to work.
> >> Afterwards, once the services are started, the cluster cannot
> >> restart the machine.
> >>
> >>
> >> The output of crm_mon -1 is:
> >> ============
> >> Last updated: Wed Aug 10 12:40:20 2011
> >> Stack: openais
> >> Current DC: host1 - partition with quorum
> >> Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
> >> 3 Nodes configured, 3 expected votes
> >> 2 Resources configured.
> >> ============
> >>
> >> Online: [ host1 host2 host3 ]
> >>
> >> Clone Set: BackEndClone
> >> Started: [ host1 host2 host3 ]
> >> Samba (ocf::heartbeat:VirtualDomain) Started [ host1 host2 host3 ]
> >>
> >> Failed actions:
> >> Samba_monitor_0 (node=host1, call=15, rc=1, status=complete): unknown error
> >> Samba_stop_0 (node=host1, call=16, rc=1, status=complete): unknown error
> >> Samba_monitor_0 (node=host2, call=12, rc=1, status=complete): unknown error
> >> Samba_stop_0 (node=host2, call=13, rc=1, status=complete): unknown error
> >> Samba_monitor_0 (node=host3, call=12, rc=1, status=complete): unknown error
> >> Samba_stop_0 (node=host3, call=13, rc=1, status=complete): unknown error
> >>
> >>
> >>
> >>
> >> this is my cluster config:
> >>
> >> root@host1:~# crm configure show
> >> node host1 \
> >> attributes standby="on"
> >> node host2 \
> >> attributes standby="on"
> >> node host3 \
> >> attributes standby="on"
> >> primitive Iscsi lsb:open-iscsi \
> >> op monitor interval="30"
> >> primitive Samba ocf:heartbeat:VirtualDomain \
> >> params config="/etc/libvirt/qemu/samba.iso.xml" \
> >> meta allow-migrate="true" \
> >> op monitor interval="30"
> >> primitive Virsh lsb:libvirt-bin \
> >> op monitor interval="30"
> >> group BackEnd Iscsi Virsh
> >> clone BackEndClone BackEnd \
> >> meta target-role="Started"
> >> colocation SambaOnBackEndClone inf: Samba BackEndClone
> >> order SambaBeforeBackEndClone inf: BackEndClone Samba
> >> property $id="cib-bootstrap-options" \
> >> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
> >> cluster-infrastructure="openais" \
> >> expected-quorum-votes="3" \
> >> stonith-enabled="false" \
> >> no-quorum-policy="ignore" \
> >> default-action-timeout="100" \
> >> last-lrm-refresh="1312970592"
> >> rsc_defaults $id="rsc-options" \
> >> resource-stickiness="200"
> >>
> >> my log is:
> >>
> >> Aug 10 13:36:34 host1 pengine: [1923]: info: get_failcount: Samba has
> >> failed INFINITY times on host1
> >> Aug 10 13:36:34 host1 pengine: [1923]: WARN: common_apply_stickiness:
> >> Forcing Samba away from host1 after 1000000 failures (max=1000000)
> >> Aug 10 13:36:34 host1 pengine: [1923]: info: get_failcount: Samba has
> >> failed INFINITY times on host2
> >> Aug 10 13:36:34 host1 pengine: [1923]: WARN: common_apply_stickiness:
> >> Forcing Samba away from host2 after 1000000 failures (max=1000000)
> >> Aug 10 13:36:34 host1 pengine: [1923]: info: get_failcount: Samba has
> >> failed INFINITY times on host3
> >> Aug 10 13:36:34 host1 pengine: [1923]: WARN: common_apply_stickiness:
> >> Forcing Samba away from host3 after 1000000 failures (max=1000000)
> >> Aug 10 13:36:34 host1 pengine: [1923]: info: native_merge_weights:
> >> BackEndClone: Rolling back scores from Samba
> >> Aug 10 13:36:34 host1 pengine: [1923]: info: native_color: Unmanaged
> >> resource Samba allocated to 'nowhere': failed
> >> Aug 10 13:36:34 host1 pengine: [1923]: WARN: native_create_actions:
> >> Attempting recovery of resource Samba
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Iscsi:0 (Started host1)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Virsh:0 (Started host1)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Iscsi:1 (Started host2)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Virsh:1 (Started host2)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Iscsi:2 (Started host3)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Virsh:2 (Started host3)
> >> Aug 10 13:36:34 host1 pengine: [1923]: notice: LogActions: Leave
> >> resource Samba (Started unmanaged)
> >
> > _______________________________________________
> > Linux-HA mailing list
> > [email protected]
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems