Happy new year!

Can you please re-post this to the Clusterlabs users list?

http://clusterlabs.org/mailman/listinfo/users

This list is being phased out.

digimer

On 05/01/16 04:34 AM, InterNetworX | Michael Rößler wrote:
> Happy new year list,
> 
> I have a test environment here for evaluating Pacemaker. Sometimes our
> KVM hosts running libvirt have trouble responding to the stonith/libvirt
> resource. Pacemaker should work like Zabbix, for example, where a service
> is regarded as failed only after 3 failed monitoring attempts. That's
> why I was searching for a suitable configuration here:
> 
> 
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html
> 
> 
> But after hours of searching I had no success.
> 
> That's the configuration line for stonith/libvirt:
> 
> crm configure primitive p_fence_ha3 stonith:external/libvirt  params
> hostlist="ha3" hypervisor_uri="qemu+tls://debian1/system" op monitor
> interval="60"
> 
> Every 60 seconds Pacemaker runs something like this:
> 
>  stonith -t external/libvirt hostlist="ha3"
> hypervisor_uri="qemu+tls://debian1/system" -S
>  ok
> 
> To simulate the KVM host being unavailable, I remove the certificate
> in /etc/libvirt/libvirtd.conf and restart libvirtd. After 60 seconds or
> less I can see the error with "crm status". On the KVM host I then add
> the certificate back to /etc/libvirt/libvirtd.conf and restart libvirt
> again. Although libvirt is available again, the stonith resource does
> not start again.
> 
> I altered the stonith configuration line with the following parts:
> 
>  op monitor interval="60" pcmk_status_retries="3"
>  op monitor interval="60" pcmk_monitor_retries="3"
>  op monitor interval="60" start-delay=180
>  meta migration-threshold="200" failure-timeout="120"
> 
> But in every case, after the first failed monitor check (within 60
> seconds or less), Pacemaker does not resume the resource once libvirt
> is available again.
> 
> Here is the "crm status" output on Debian 8 (Jessie):
> 
>  root@ha4:~# crm status
>  Last updated: Tue Jan  5 10:04:18 2016
>  Last change: Mon Jan  4 18:18:12 2016
>  Stack: corosync
>  Current DC: ha3 (167772400) - partition with quorum
>  Version: 1.1.12-561c4cf
>  2 Nodes configured
>  2 Resources configured
>  Online: [ ha3 ha4 ]
>  Service-IP     (ocf::heartbeat:IPaddr2):       Started ha3
>  haproxy        (lsb:haproxy):  Started ha3
> 
> Kind regards
> 
> Michael R.
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/openais


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?