Something seems very wrong with this at the corosync level. Even fenced and the dlm are having issues.
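To make that concrete, here is a rough Python sketch that tallies the CPG join failures per daemon from log lines like the ones quoted below. The function name `tally_cpg_failures` and both regular expressions are my own, written against these sample lines only; they are not part of any Pacemaker or corosync tooling. As far as I know, the trailing 6 in the pacemakerd error corresponds to CS_ERR_TRY_AGAIN in corosync's cs_error_t, which would fit corosync never becoming ready to accept CPG clients.

```python
import re
from collections import Counter

# Sample lines copied from the logs quoted in this thread.
SAMPLE_LOG = """\
Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying
Sep 26 11:15:14 dlm_controld daemon cpg_join error retrying
Sep 26 11:15:04 fenced daemon cpg_join error retrying
Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
"""

# "<timestamp> <daemon> daemon cpg_join error retrying" (fenced, dlm_controld)
RETRY_RE = re.compile(r"^\w{3}\s+\d+ [\d:]+ (\S+) daemon cpg_join error retrying")
# "<daemon>: [pid]: ERROR: cluster_connect_cpg: ...: <code>" (pacemakerd)
FAIL_RE = re.compile(r"(\w+): \[\d+\]: ERROR: cluster_connect_cpg: .*: (\d+)\s*$")


def tally_cpg_failures(text):
    """Return (retries per daemon, list of (daemon, error code)) for CPG joins."""
    retries = Counter()
    hard_failures = []
    for line in text.splitlines():
        match = RETRY_RE.search(line)
        if match:
            retries[match.group(1)] += 1
            continue
        match = FAIL_RE.search(line)
        if match:
            hard_failures.append((match.group(1), int(match.group(2))))
    return retries, hard_failures


if __name__ == "__main__":
    retries, hard_failures = tally_cpg_failures(SAMPLE_LOG)
    print(dict(retries), hard_failures)
```

On the sample this counts two retries for dlm_controld, one for fenced, and one hard failure for pacemakerd with code 6. Three independent daemons failing the same call points at corosync rather than at any one of them.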
Jan: Could this be firewall related?

On 27 Sep 2013, at 10:44 pm, Bartłomiej Wójcik <bartlomiej.woj...@turbineam.com> wrote:

> On 2013-09-27 04:26, Andrew Beekhof wrote:
>> On 26/09/2013, at 8:35 PM, Bartłomiej Wójcik <bartlomiej.woj...@turbineam.com> wrote:
>>
>>> Hello,
>>>
>>> I installed Pacemaker in accordance with
>>> http://clusterlabs.org/quickstart-ubuntu.html
>>> on two Ubuntu 13.04 nodes, changing only the IP addresses.
>>>
>>> /etc/cluster/cluster.conf:
>>>
>>> <?xml version="1.0"?>
>>> <cluster config_version="1" name="pacemaker1">
>>>   <logging debug="off"/>
>>>   <clusternodes>
>>>     <clusternode name="fmpgpool4" nodeid="1">
>>>       <fence>
>>>         <method name="pcmk-redirect">
>>>           <device name="pcmk" port="fmpgpool4"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>     <clusternode name="fmpgpool5" nodeid="2">
>>>       <fence>
>>>         <method name="pcmk-redirect">
>>>           <device name="pcmk" port="fmpgpool5"/>
>>>         </method>
>>>       </fence>
>>>     </clusternode>
>>>   </clusternodes>
>>>   <fencedevices>
>>>     <fencedevice name="pcmk" agent="fence_pcmk"/>
>>>   </fencedevices>
>>> </cluster>
>>>
>>> On the server I get only:
>>> ps -ef|grep pacemaker
>>>
>>> pacemakerd
>>>
>> What do the logs from pacemakerd say?
>>
>>> and nothing more.
>>>
>>> I try to run:
>>> crm configure property stonith-enabled=false
>>>
>>> and get:
>>> Signon to CIB failed: connection failed
>>> Init failed, could not perform requested operations
>>> ERROR: cannot parse xml: no element found: line 1, column 0
>>> ERROR: No CIB!
>>>
>>> I don't know what could be wrong.
>>>
>>> Regards!
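For what it's worth, the cluster.conf above looks internally consistent, which is another reason to look at the corosync level instead. Below is a minimal Python sketch of that sanity check. The function name `check_fencing` is hypothetical, and the rule that each device's `port` should equal the node's own name is my assumption based on how the quickstart sets up fence_pcmk; it is not an official validator.

```python
import xml.etree.ElementTree as ET

# cluster.conf as posted in this thread.
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker1">
  <logging debug="off"/>
  <clusternodes>
    <clusternode name="fmpgpool4" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="fmpgpool4"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="fmpgpool5" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="fmpgpool5"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
</cluster>
"""


def check_fencing(conf_text):
    """Return a list of human-readable problems; empty means consistent."""
    root = ET.fromstring(conf_text)
    declared = {dev.get("name") for dev in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        name = node.get("name")
        devices = node.findall("./fence/method/device")
        if not devices:
            problems.append(f"{name}: no fence device configured")
        for dev in devices:
            if dev.get("name") not in declared:
                problems.append(f"{name}: undeclared fence device {dev.get('name')!r}")
            # Assumption: with fence_pcmk, port names the node being fenced.
            if dev.get("port") != name:
                problems.append(f"{name}: port {dev.get('port')!r} does not match node name")
    return problems


if __name__ == "__main__":
    print(check_fencing(CLUSTER_CONF) or "fencing section looks consistent")
```

This only checks cross-references inside the file; it says nothing about whether the nodes can actually reach each other.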
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>
> Hello,
>
> corosync.log:
>
> Sep 26 11:14:50 corosync [MAIN ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
> Sep 26 11:14:50 corosync [MAIN ] Corosync built-in features: nss
> Sep 26 11:14:50 corosync [MAIN ] Successfully read config from /etc/cluster/cluster.conf
> Sep 26 11:14:50 corosync [MAIN ] Successfully parsed cman config
> Sep 26 11:14:50 corosync [MAIN ] Successfully configured openais services to load
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
> Sep 26 11:14:50 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Sep 26 11:14:50 corosync [TOTEM ] The network interface [10.0.0.34] is now up.
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:50 corosync [CMAN ] CMAN 3.1.8 (built Jan 17 2013 06:24:33) started
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync CMAN membership service 2.90
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais cluster membership service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais event service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais checkpoint service B.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais message service B.03.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais distributed locking service B.03.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: openais timer service A.01.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync extended virtual synchrony service
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync configuration service
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster config database access v1.01
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync profile loading service
> Sep 26 11:14:50 corosync [QUORUM] Using quorum provider quorum_cman
> Sep 26 11:14:50 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
> Sep 26 11:14:56 corosync [CLM ] Members Left:
> Sep 26 11:14:56 corosync [CLM ] Members Joined:
> Sep 26 11:14:56 corosync [CLM ]     r(0) ip(10.0.0.35)
> Sep 26 11:14:56 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff): generated-manpages agent-manpages ncurses heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:15:31 fmpgpool4 pacemakerd: [15471]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:23:16 fmpgpool4 pacemakerd: [15471]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: notice: main: Starting Pacemaker 1.1.7 (Build: ee0730e13d124c3d58f00016c3376a1de5323cff): generated-manpages agent-manpages ncurses heartbeat corosync-plugin cman snmp libesmtp
> Sep 26 11:27:30 fmpgpool4 pacemakerd: [15803]: info: main: Maximum core file size is: 18446744073709551615
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: cluster_connect_cpg: Could not join the CPG group 'pacemakerd': 6
> Sep 26 11:35:15 fmpgpool4 pacemakerd: [15803]: ERROR: main: Couldn't connect to Corosync's CPG service
> Set r/w permissions for uid=108, gid=0 on /var/log/cluster/corosync.log
>
> dlm_controld.log:
>
> Sep 26 11:14:54 dlm_controld dlm_controld 3.1.8 started
> Sep 26 11:15:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:24 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:34 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:44 dlm_controld daemon cpg_join error retrying
> Sep 26 11:15:54 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:04 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:14 dlm_controld daemon cpg_join error retrying
> Sep 26 11:16:24 dlm_controld daemon cpg_join error retrying
> and so on...
>
> fenced.log:
>
> Sep 26 11:14:54 fenced fenced 3.1.8 started
> Sep 26 11:15:04 fenced daemon cpg_join error retrying
> Sep 26 11:15:14 fenced daemon cpg_join error retrying
> Sep 26 11:15:24 fenced daemon cpg_join error retrying
> Sep 26 11:15:34 fenced daemon cpg_join error retrying
> Sep 26 11:15:44 fenced daemon cpg_join error retrying
> Sep 26 11:15:54 fenced daemon cpg_join error retrying
> Sep 26 11:16:04 fenced daemon cpg_join error retrying
> Sep 26 11:16:14 fenced daemon cpg_join error retrying
> and so on...
>
> Regards!