[ClusterLabs] unsubscribe

2024-02-12 Thread Matthieu

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Installing on SLES 12 -- Where's the Repos?

2017-06-16 Thread Matthieu Fatrez
Hello Eric,

You can test it for free; you just need to register at
https://scc.suse.com/login
After that, you have access to the SLES repositories for 60 days.

The HA repo is here:
https://www.suse.com/products/highavailability/download/
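
For reference, registering and installing the HA stack from the command line
usually looks roughly like the sketch below. The product string, the
<REGCODE> placeholder and the pattern name are assumptions for SLES 12 SP2
and may differ on your service pack, so double-check them on the target
system:

# register the base system with the code from scc.suse.com
SUSEConnect -r <REGCODE>
# activate the High Availability extension (adjust version/architecture)
SUSEConnect -p sle-ha/12.2/x86_64 -r <REGCODE>
# install the HA pattern (corosync, pacemaker, etc.)
zypper refresh
zypper install -t pattern ha_sles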

Matthieu

2017-06-16 9:21 GMT+02:00 Eric Robinson <eric.robin...@psmnv.com>:

> We’ve been a Red Hat/CentOS shop for 10+ years and have installed
> Corosync+Pacemaker+DRBD dozens of times using the repositories, all for
> free.
>
>
>
> We are now trying out our first SLES 12 server, and I’m looking for the
> repos. Where the heck are they? I went looking, and all I can find is the
> SLES “High Availability Extension,” which I must pay $700/year for? No
> freaking way!
>
>
>
> This is Linux we’re talking about, right? There’s got to be an easy way to
> install the cluster without paying for a subscription… right?
>
>
>
> Someone talk me off the ledge here.
>
>
>
> --
>
> Eric Robinson
>
>
>
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Corosync - both nodes stay online

2017-01-17 Thread Matthieu Flye Sainte Marie
No, it is not a typo... I have tried backports, but the version is still 1.2.0.
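
For reference, a quick way to double-check what each configured repository
actually offers (including any backports entries) is apt-cache policy,
roughly like this:

# prints the installed version and the best candidate from all repos
apt-cache policy corosync pacemaker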

I think the easiest way is to upgrade my system.

Thank you

2017-01-17 9:27 GMT+01:00 Jan Friesse :

> Hi all,
>>
>> I have a two node cluster with the following details:
>> - Ubuntu 10.04.4 LTS (I know its old…)
>> - corosync 1.2.0
>>
>
> Isn't this a typo? I mean, 1.2.0 is ancient and full of already-fixed
> bugs.
>
>
> - pacemaker 1.0.8+hg15494-2ubuntu2
>>
>> Following configuration is applied to corosync:
>>
>> totem {
>>  version: 2
>>  token: 3000
>>  token_retransmits_before_loss_const: 10
>>  join: 60
>>  consensus: 5000
>>  vsftype: none
>>  max_messages: 20
>>  clear_node_high_bit: yes
>>  secauth: off
>>  threads: 0
>>  rrp_mode: none
>>  cluster_name: firewall-ha
>>
>>  interface {
>>  ringnumber: 0
>>  bindnetaddr: 192.168.211.1
>>  broadcast : yes
>>  mcastport: 5405
>>  ttl : 1
>>  }
>>
>>  transport: udpu
>> }
>>
>> nodelist {
>>  node {
>>  ring0_addr: 192.168.211.1
>>  name: net1
>>  nodeid: 1
>>  }
>>  node {
>>  ring0_addr: 192.168.211.2
>>  name: net2
>>  nodeid: 2
>>  }
>> }
>>
>> quorum {
>>  provider: corosync_votequorum
>>  two_node: 1
>> }
>>
>> amf {
>>  mode: disabled
>> }
>>
>> service {
>>  ver:   0
>>  name:  pacemaker
>> }
>>
>> aisexec {
>>  user:   root
>>  group:  root
>> }
>>
>> logging {
>>  fileline: off
>>  to_stderr: yes
>>  to_logfile: yes
>>  to_syslog: yes
>>  logfile: /var/log/corosync/corosync.log
>> syslog_facility: daemon
>>  debug: off
>>  timestamp: on
>>  logger_subsys {
>>  subsys: AMF
>>  debug: off
>>  tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>  }
>> }
>>
>>
> Actually, the config file most likely doesn't work the way you expect. The
> nodelist, for example, is a 2.x concept and is not supported by 1.x. The
> same applies to corosync_votequorum. The udpu transport is not implemented
> in 1.2.0 (it was added in 1.3.0).
>
> I would recommend using a backports repo and upgrading.
>
> Regards,
>   Honza
>
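
To make the distinction concrete, a minimal corosync 2.x configuration for
this two-node setup might look roughly like the following (values carried
over from the original post; option names should be checked against the
corosync.conf(5) man page of the installed version):

totem {
    version: 2
    cluster_name: firewall-ha
    transport: udpu
    token: 3000
}

nodelist {
    node {
        ring0_addr: 192.168.211.1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.211.2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

Note that with 2.x there is no service/aisexec section; pacemaker runs as a
separate service and is started on its own.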

[ClusterLabs] Corosync - both nodes stay online

2017-01-16 Thread Matthieu
Hi all,

I have a two node cluster with the following details:
- Ubuntu 10.04.4 LTS (I know it's old…)
- corosync 1.2.0
- pacemaker 1.0.8+hg15494-2ubuntu2

Following configuration is applied to corosync:

totem {
    version: 2
    token: 3000
    token_retransmits_before_loss_const: 10
    join: 60
    consensus: 5000
    vsftype: none
    max_messages: 20
    clear_node_high_bit: yes
    secauth: off
    threads: 0
    rrp_mode: none
    cluster_name: firewall-ha

    interface {
        ringnumber: 0
        bindnetaddr: 192.168.211.1
        broadcast: yes
        mcastport: 5405
        ttl: 1
    }

    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.211.1
        name: net1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.211.2
        name: net2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

amf {
    mode: disabled
}

service {
    ver: 0
    name: pacemaker
}

aisexec {
    user: root
    group: root
}

logging {
    fileline: off
    to_stderr: yes
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/corosync/corosync.log
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}

Here is the output of crm status after starting corosync on both nodes:

Last updated: Mon Jan 16 21:24:18 2017
Stack: openais
Current DC: net1 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.


Online: [ net1 net2 ]

Now if I kill net2 with:
killall -9 corosync

The primary host doesn't "see" anything; the cluster still appears to be
online on net1:

Last updated: Mon Jan 16 21:25:25 2017
Stack: openais
Current DC: net1 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.


Online: [ net1 net2 ]

I just see this part in the logs:
Jan 16 21:35:21 corosync [TOTEM ] A processor failed, forming new configuration.

And then, when I start corosync on net2, the cluster stays offline:

Last updated: Mon Jan 16 21:38:13 2017
Stack: openais
Current DC: NONE
2 Nodes configured, 2 expected votes
0 Resources configured.


OFFLINE: [ net1 net2 ]

I have to kill corosync on both nodes and start it on both nodes together to
get back online.
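
For what it's worth, membership as seen by corosync itself (independently of
pacemaker) can be inspected roughly as follows; the tools differ between the
1.x and 2.x series, so treat this as a sketch:

# corosync 1.x: dump the runtime object database and look for member entries
corosync-objctl | grep member
# corosync 2.x: quorum and membership views
corosync-quorumtool -s
corosync-cmapctl | grep members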


When the two nodes are up, I can see traffic with tcpdump:
21:41:49.653780 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
21:41:49.678846 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
21:41:49.680339 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70
21:41:49.889424 IP 192.168.211.1.5404 > 255.255.255.255.5405: UDP, length 82
21:41:49.910492 IP 192.168.211.1.5404 > 192.168.211.2.5405: UDP, length 70
21:41:49.911990 IP 192.168.211.2.5404 > 192.168.211.1.5405: UDP, length 70
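
A capture like the one above can usually be reproduced with something along
these lines (the interface name is an assumption):

tcpdump -n -i eth0 'udp and (port 5404 or port 5405)'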

Here is the output of the state of the ring on net1:
corosync-cfgtool -s
Printing ring status.
Local node ID 30648512
RING ID 0
id  = 192.168.211.1
status  = ring 0 active with no faults

And net2:
Printing ring status.
Local node ID 47425728
RING ID 0
id  = 192.168.211.2
status  = ring 0 active with no faults

Here is the log on net1 when I start the cluster on both nodes:
Jan 16 21:41:52 net1 crmd: [15288]: info: crm_timer_popped: Election Trigger 
(I_DC_TIMEOUT) just popped!
Jan 16 21:41:52 net1 crmd: [15288]: WARN: do_log: FSA: Input I_DC_TIMEOUT from 
crm_timer_popped() received in state S_PENDING
Jan 16 21:41:52 net1 crmd: [15288]: info: do_state_transition: State transition 
S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED 
origin=crm_timer_popped ]
Jan 16 21:41:52 net1 crmd: [15288]: info: do_state_transition: State transition 
S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL 
origin=do_election_check ]
Jan 16 21:41:52 net1 crmd: [15288]: info: do_te_control: Registering TE UUID: 
53d7e000-3468-4548-b9f9-5bdb9ac9bfc7
Jan 16 21:41:52 net1 crmd: [15288]: WARN: cib_client_add_notify_callback: 
Callback already present
Jan 16 21:41:52 net1 crmd: [15288]: info: set_graph_functions: Setting custom 
graph functions
Jan 16 21:41:52 net1 crmd: [15288]: info: unpack_graph: Unpacked transition -1: 
0 actions in 0 synapses
Jan 16 21:41:52 net1 crmd: [15288]: info: do_dc_takeover: Taking over DC status 
for this partition
Jan 16 21:41:52 net1 cib: [15284]: info: cib_process_readwrite: We are now in 
R/W mode
Jan 16 21:41:52 net1 cib: [15284]: info: cib_process_request: Operation 
complete: op