Hi,

my first idea would be to fix bindnetaddr. It should be the network address, not the machine's own address.
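
To illustrate, here is a minimal sketch of how the network address can be derived from a host address using Python's standard ipaddress module. The /24 prefix length is only an assumption, since the netmask isn't shown in your config, and the helper name network_address is made up for this example.

```python
import ipaddress


def network_address(host_ip: str, prefix_len: int) -> str:
    """Return the network address for a host IP and prefix length."""
    # ip_interface accepts a host address with a prefix, e.g. "172.16.135.9/24"
    iface = ipaddress.ip_interface(f"{host_ip}/{prefix_len}")
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Prefix length 24 is an ASSUMPTION; use the real netmask of each node.
    for host in ("172.16.135.9", "172.16.64.248"):
        print(host, "->", network_address(host, 24))
```

If the two nodes really are in different subnets, as the member addresses suggest, each node would presumably need its own bindnetaddr value in its local corosync.conf.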

Regards
Fabian

On 02/27/2014 03:42 PM, TRIBOLET Thomas wrote:
Hello,

Before starting, my first language is French so I'll try to do my best to 
explain my problem in English.


1)      The situation :

I have 2 servers at 2 distant sites.

I need to run openvpn with the same configuration on the 2 servers.
But it must run only on one server at a time.

I want it to start on the second server when the first node loses its 
Internet connection.

I use Debian with corosync and pacemaker.

Here is the config :


A)     Corosync.conf :
compatibility: whitetank
totem {
         version: 2
         token: 3000
         token_retransmits_before_loss_const: 10
         join: 240
         consensus: 3600
         vsftype: none
         max_messages: 20
         clear_node_high_bit: yes
         secauth: off
         threads: 0
         nodeid: 1111
         rrp_mode: none
         interface {
                 member {
                         memberaddr: 172.16.135.9
                 }
                 member {
                         memberaddr: 172.16.64.248
                 }
                 ringnumber: 0
                 bindnetaddr: 172.16.135.9
                 mcastport: 5405
         }
         transport: udpu
}
amf {
         mode: disabled
}
service {
         ver:       0
         name:      pacemaker
}
aisexec {
         user:   root
         group:  root
}
logging {
         fileline: off
         to_stderr: yes
         to_logfile: yes
         logfile: /var/log/corosync/corosync.log
         to_syslog: yes
         syslog_facility: daemon
         debug: off
         timestamp: on
         logger_subsys {
                 subsys: AMF
                 debug: off
                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
         }
}

B)      Pacemaker :
node controle-col
node vpn-air
primitive ClusterMon ocf:pacemaker:ClusterMon \
         params user="root" update="30" extra_options="-E /root/PacemakerMailScript.sh -h /tmp/ClusterMon.html" \
         op monitor on-fail="restart" interval="60"
primitive openvpn lsb:openvpn \
         op monitor interval="30s"
primitive p_ping ocf:pacemaker:ping \
         params host_list="8.8.8.8 4.2.2.2" multiplier="100" dampen="5s" \
         op monitor interval="60" timeout="60" \
         op start interval="0" timeout="60" \
         op stop interval="0" timeout="60"
clone ClusterMon-clone ClusterMon
clone c_ping p_ping
location OpenVpnCluster openvpn \
         rule $id="OpenVpnCluster-rule" -inf: not_defined pingd or pingd lte 0
location PrefVpnAir openvpn \
         rule $id="PrefVpnAir-rule" 50: #uname eq vpn-air
property $id="cib-bootstrap-options" \
         dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         stonith-enabled="false" \
         no-quorum-policy="ignore"


C)      crm_mon output when everything runs well :
============
Last updated: Thu Feb 27 14:54:31 2014
Last change: Wed Jan 15 12:51:35 2014 via crmd on controle-col
Stack: openais
Current DC: controle-col - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, 2 expected votes
5 Resources configured.
============

Online: [ vpn-air controle-col ]

Clone Set: c_ping [p_ping]
      Started: [ controle-col vpn-air ]
openvpn (lsb:openvpn):  Started vpn-air
Clone Set: ClusterMon-clone [ClusterMon]
      Started: [ controle-col vpn-air ]


2)      My problem :

When there is a network problem, for example :

a) The first node's site loses its Internet connection (and, at the same time, 
communication with the second node, because the VPN runs over the Internet connection).
b) The cluster stops openvpn on the first node and starts it on the second, 
because of the p_ping primitive in the config.
c) The connection comes back on the first node's site.
d) Problem : the first and second nodes don't re-form the cluster. They don't see 
each other, and each node forms its own cluster -> split brain, I think.
e) Each node has openvpn running, which shouldn't happen.


I don't have stonith running because I think it would be problematic 
without quorum.
Is there a way to tell corosync to re-form the ring ?

Or does someone have another solution ?

Thanks


Tribolet Thomas
ISSeP (Institut Scientifique de Service Public)
th.tribo...@issep.be
+32 (0) 4229 83 46

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

