Re: [ClusterLabs] Installing on SLES 12 -- Where's the Repos?
Hello Eric,

You can test it for free; you just need to register at https://scc.suse.com/login. After that, you have access to the SLES repositories for 60 days. And the HA repository is here: https://www.suse.com/products/highavailability/download/

Matthieu

2017-06-16 9:21 GMT+02:00 Eric Robinson <eric.robin...@psmnv.com>:
> We’ve been a Red Hat/CentOS shop for 10+ years and have installed
> Corosync+Pacemaker+DRBD dozens of times using the repositories, all for free.
>
> We are now trying out our first SLES 12 server, and I’m looking for the
> repos. Where the heck are they? I went looking, and all I can find is the
> SLES “High Availability Extension,” which I must pay $700/year for? No
> freaking way!
>
> This is Linux we’re talking about, right? There’s got to be an easy way to
> install the cluster without paying for a subscription… right?
>
> Someone talk me off the ledge here.
>
> --
> Eric Robinson
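In practice, getting the cluster stack onto a registered SLES 12 trial system comes down to registering the base product and then activating the HA extension. A rough sketch, assuming SUSEConnect is available and using placeholder values for the registration code and e-mail address; the exact product identifier and pattern name vary by service pack, so check SUSEConnect --list-extensions first:

# Register the base system against the SUSE Customer Center
# (REGCODE and the e-mail address are placeholders for your trial account).
sudo SUSEConnect -r REGCODE -e you@example.com

# Activate the High Availability extension; the product string below
# assumes SLES 12 SP2 on x86_64 and may differ on your system.
sudo SUSEConnect -p sle-ha/12.2/x86_64 -r REGCODE

# Install the cluster stack from the newly added repositories.
sudo zypper refresh
sudo zypper install -t pattern ha_sles   # pulls in corosync, pacemaker, crmsh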
Re: [ClusterLabs] Corosync - both nodes stay online
No, it is not a typo... I have tried the backports, but the version is still 1.2.0. I think the easiest way is to upgrade my system. Thank you.

2017-01-17 9:27 GMT+01:00 Jan Friesse:
> Hi all,
>>
>> I have a two node cluster with the following details:
>> - Ubuntu 10.04.4 LTS (I know it's old…)
>> - corosync 1.2.0
>
> Isn't this a typo? I mean, 1.2.0 is ... ancient and full of already fixed bugs.
>
>> - pacemaker 1.0.8+hg15494-2ubuntu2
>>
>> The following configuration is applied to corosync:
>>
>> totem {
>>     version: 2
>>     token: 3000
>>     token_retransmits_before_loss_const: 10
>>     join: 60
>>     consensus: 5000
>>     vsftype: none
>>     max_messages: 20
>>     clear_node_high_bit: yes
>>     secauth: off
>>     threads: 0
>>     rrp_mode: none
>>     cluster_name: firewall-ha
>>
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 192.168.211.1
>>         broadcast: yes
>>         mcastport: 5405
>>         ttl: 1
>>     }
>>
>>     transport: udpu
>> }
>>
>> nodelist {
>>     node {
>>         ring0_addr: 192.168.211.1
>>         name: net1
>>         nodeid: 1
>>     }
>>     node {
>>         ring0_addr: 192.168.211.2
>>         name: net2
>>         nodeid: 2
>>     }
>> }
>>
>> quorum {
>>     provider: corosync_votequorum
>>     two_node: 1
>> }
>>
>> amf {
>>     mode: disabled
>> }
>>
>> service {
>>     ver: 0
>>     name: pacemaker
>> }
>>
>> aisexec {
>>     user: root
>>     group: root
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: yes
>>     to_logfile: yes
>>     to_syslog: yes
>>     logfile: /var/log/corosync/corosync.log
>>     syslog_facility: daemon
>>     debug: off
>>     timestamp: on
>>     logger_subsys {
>>         subsys: AMF
>>         debug: off
>>         tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>     }
>> }
>>
>
> Actually, the config file most likely doesn't work as you expect. For example, the nodelist is a 2.x concept and is unsupported by 1.x. The same applies to corosync_votequorum. Transport udpu is not implemented in 1.2.0 (it was added in 1.3.0).
>
> I would recommend using some backports repo and upgrading.
>
> Regards,
>   Honza
>
>> Here is the output of crm status after starting corosync on both nodes:
>>
>> Last updated: Mon Jan 16 21:24:18 2017
>> Stack: openais
>> Current DC: net1 - partition with quorum
>> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
>> 2 Nodes configured, 2 expected votes
>> 0 Resources configured.
>>
>> Online: [ net1 net2 ]
>>
>> Now if I kill net2 with:
>> killall -9 corosync
>>
>> The primary host doesn't « see » anything; the cluster still appears to be online on net1:
>>
>> Last updated: Mon Jan 16 21:25:25 2017
>> Stack: openais
>> Current DC: net1 - partition with quorum
>> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
>> 2 Nodes configured, 2 expected votes
>> 0 Resources configured.
>>
>> Online: [ net1 net2 ]
>>
>> I just see this part in the logs:
>> Jan 16 21:35:21 corosync [TOTEM ] A processor failed, forming new configuration.
>>
>> And then, when I start corosync on net2, the cluster stays offline:
>>
>> Last updated: Mon Jan 16 21:38:13 2017
>> Stack: openais
>> Current DC: NONE
>> 2 Nodes configured, 2 expected votes
>> 0 Resources configured.
>>
>> OFFLINE: [ net1 net2 ]
>>
>> I have to kill corosync on both nodes and start it on both nodes together to get back online.
>>
>> When the two nodes are up, I can see traffic with tcpdump:
>> 21:41:49.653780 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
>> 21:41:49.678846 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
>> 21:41:49.680339 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70
>> 21:41:49.889424 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
>> 21:41:49.910492 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
>> 21:41:49.911990 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70
>>
>> Here is the state of the ring on net1:
>> corosync-cfgtool -s
>> Printing ring status.
>> Local node ID 30648512
>> RING ID 0
>>     id = 192.168.211.1
>>     status = ring 0 active with no faults
>>
>> And net2:
>> Printing ring status.
>> Local node ID 47425728
>> RING ID 0
>>     id = 192.168.211.2
>>     status = ring 0 active with no faults
>>
>> Here is the log on net1 when I start the cluster on both nodes:
>> Jan 16 21:41:52 net1 crmd: [15288]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
>> Jan 16 21:41:52 net1 crmd: [15288]: WARN: do_log: FSA:
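A quick way to confirm what Honza describes, namely that the 2.x-only directives (nodelist, corosync_votequorum, transport: udpu) are simply not honoured by the 1.2.0 binary, is to check which version is actually installed and what the daemon puts on the wire. A small sketch, assuming Ubuntu packaging and that eth0 carries the 192.168.211.0/24 network (adjust the interface name):

# Version of the running binary and of the installed packages
corosync -v
dpkg -l corosync pacemaker | grep '^ii'

# Versions available from the configured repositories/backports
apt-cache policy corosync

# With a working "transport: udpu" you would expect only unicast packets
# between the two ring addresses; broadcasts to 255.255.255.255:5405,
# as in the tcpdump output quoted above, indicate that the interface's
# "broadcast: yes" setting is what is actually in effect.
tcpdump -ni eth0 udp port 5405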
[ClusterLabs] Corosync - both nodes stay online
Hi all,

I have a two node cluster with the following details:
- Ubuntu 10.04.4 LTS (I know it's old…)
- corosync 1.2.0
- pacemaker 1.0.8+hg15494-2ubuntu2

The following configuration is applied to corosync:

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    join: 60
    consensus: 5000
    vsftype: none
    max_messages: 20
    clear_node_high_bit: yes
    secauth: off
    threads: 0
    rrp_mode: none
    cluster_name: firewall-ha

    interface {
        ringnumber: 0
        bindnetaddr: 192.168.211.1
        broadcast: yes
        mcastport: 5405
        ttl: 1
    }

    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.211.1
        name: net1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.211.2
        name: net2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

amf {
    mode: disabled
}

service {
    ver: 0
    name: pacemaker
}

aisexec {
    user: root
    group: root
}

logging {
    fileline: off
    to_stderr: yes
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/corosync/corosync.log
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}

Here is the output of crm status after starting corosync on both nodes:

Last updated: Mon Jan 16 21:24:18 2017
Stack: openais
Current DC: net1 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ net1 net2 ]

Now if I kill net2 with:
killall -9 corosync

The primary host doesn't « see » anything; the cluster still appears to be online on net1:

Last updated: Mon Jan 16 21:25:25 2017
Stack: openais
Current DC: net1 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ net1 net2 ]

I just see this part in the logs:
Jan 16 21:35:21 corosync [TOTEM ] A processor failed, forming new configuration.

And then, when I start corosync on net2, the cluster stays offline:

Last updated: Mon Jan 16 21:38:13 2017
Stack: openais
Current DC: NONE
2 Nodes configured, 2 expected votes
0 Resources configured.

OFFLINE: [ net1 net2 ]

I have to kill corosync on both nodes and start it on both nodes together to get back online.

When the two nodes are up, I can see traffic with tcpdump:
21:41:49.653780 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
21:41:49.678846 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
21:41:49.680339 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70
21:41:49.889424 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
21:41:49.910492 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
21:41:49.911990 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70

Here is the state of the ring on net1:
corosync-cfgtool -s
Printing ring status.
Local node ID 30648512
RING ID 0
    id = 192.168.211.1
    status = ring 0 active with no faults

And net2:
Printing ring status.
Local node ID 47425728
RING ID 0
    id = 192.168.211.2
    status = ring 0 active with no faults

Here is the log on net1 when I start the cluster on both nodes:
Jan 16 21:41:52 net1 crmd: [15288]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
Jan 16 21:41:52 net1 crmd: [15288]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jan 16 21:41:52 net1 crmd: [15288]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
Jan 16 21:41:52 net1 crmd: [15288]: info: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Jan 16 21:41:52 net1 crmd: [15288]: info: do_te_control: Registering TE UUID: 53d7e000-3468-4548-b9f9-5bdb9ac9bfc7
Jan 16 21:41:52 net1 crmd: [15288]: WARN: cib_client_add_notify_callback: Callback already present
Jan 16 21:41:52 net1 crmd: [15288]: info: set_graph_functions: Setting custom graph functions
Jan 16 21:41:52 net1 crmd: [15288]: info: unpack_graph: Unpacked transition -1: 0 actions in 0 synapses
Jan 16 21:41:52 net1 crmd: [15288]: info: do_dc_takeover: Taking over DC status for this partition
Jan 16 21:41:52 net1 cib: [15284]: info: cib_process_readwrite: We are now in R/W mode
Jan 16 21:41:52 net1 cib: [15284]: info: cib_process_request: Operation complete: op
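Per the advice earlier in the thread, the underlying fix is to move to a corosync release that actually implements nodelist, votequorum and udpu. Purely as a reference sketch (untested here, and assuming the upgrade lands on corosync 2.x), a minimal two-node udpu configuration along the lines the poster intended could be written out as below; the addresses and node names are taken from the thread, and the exact options should be checked against corosync.conf(5) for the version actually installed:

# Sketch only: write a minimal corosync 2.x two-node config (udpu + votequorum).
# Run as root on both nodes, adjusting nothing but the local paths if needed.
cat > /etc/corosync/corosync.conf <<'EOF'
totem {
    version: 2
    cluster_name: firewall-ha
    token: 3000
    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.211.1
        name: net1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.211.2
        name: net2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
}
EOF

Note that with corosync 2.x pacemaker runs as its own daemon rather than being loaded through the service { ver: 0 } plugin block, so that section (and aisexec/amf) is intentionally dropped from the sketch.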